Abstract

We review the formalism and applications of the halo-based description of nonlin- 
ear gravitational clustering. In this approach, all mass is associated with virialized 
dark matter halos; models of the number and spatial distribution of the halos, and 
the distribution of dark matter within each halo, are used to provide estimates of 
how the statistical properties of large scale density and velocity fields evolve as a 
result of nonlinear gravitational clustering. We first describe the model, and demon- 
strate its accuracy by comparing its predictions with exact results from numerical 
simulations of nonlinear gravitational clustering. We then present several astrophys- 
ical applications of the halo model: these include models of the spatial distribution 
of galaxies, the nonlinear velocity, momentum and pressure fields, descriptions of 
weak gravitational lensing, and estimates of secondary contributions to temperature 
fluctuations in the cosmic microwave background. 
Fig. 1. The complex distribution of dark matter (a) found in numerical simulations
can be easily replaced with a distribution of dark matter halos (b) with the mass 
function following that found in simulations and with a profile for dark matter 
within halos. 

1 Introduction 

This review presents astrophysical applications of an approach which has its 
origins in papers by Jerzy Neyman & Elizabeth Scott and their collaborators 
nearly fifty years ago. Neyman & Scott [199] were interested in describing 
the spatial distribution of galaxies. They argued that it was useful to think 
of the galaxy distribution as being made up of distinct clusters with a range 
of sizes. Since galaxies are discrete objects, they described how to study sta- 
tistical properties of a distribution of discrete points; the description required 
knowledge of the distribution of cluster sizes, the distribution of points around 
the cluster center, and a description of the clustering of the clusters [199]. At 
that time, none of these ingredients were known, and so in subsequent work 
[200,201], they focussed on inferring these parameters from data which was 
just becoming useful for statistical studies. 

Since that time, it has become clear that much of the mass in the Universe 
is dark, and that this mass was initially rather smoothly distributed. There- 
fore, the luminous galaxies we see today may be biased tracers of the dark 
matter distribution. That is to say, the relation between the number of galax- 
ies in a randomly placed cell and the amount of dark matter the same cell 
contains, may be rather complicated. In addition, there is evidence that the 
initial fluctuation field was very close to a Gaussian random field. Linear 
and higher order perturbation theory descriptions of gravitational clustering 
from Gaussian initial fluctuations have been developed (see Bernardeau et 
al. [15] for a comprehensive review); these describe the evolution and mildly 
non-linear clustering of the dark matter, but they break down when the clus- 
tering is highly non-linear (typically, this happens on scales smaller than a few 
Megaparsecs). Also, perturbation theory provides no rigorous framework for
describing how the clustering of galaxies differs from that of the dark matter. 

The non-linear evolution of the dark matter distribution has also been stud- 
ied extensively using numerical simulations of the large scale structure clus- 
tering process. These simulations show that an initially smooth matter dis- 
tribution evolves into a complex network of sheets, filaments and knots (e.g., 
figure 1). The dense knots are often called dark matter halos. High resolution, 
but relatively small volume, simulations have been used to provide detailed 
information about the distribution of mass in and around such halos (i.e., 
the halo density profile of [198,195]), whereas larger volume, but lower res- 
olution simulations (e.g., the Hubble Volume simulations [80] of the Virgo 
consortium [278]), have provided information about the abundance and
spatial distribution of halos [135,41]. Simulations such as these show that the
halo abundance, spatial distribution, and internal density profiles are closely 
related to the properties of the initial fluctuation field. When these halos are 
treated as the analogs of Neyman & Scott's clusters, their formalism provides 
a way to describe the spatial statistics of the dark matter density field from 
the linear to highly non-linear regimes. 

Such a halo based description of the dark matter distribution of large scale 
structure is extremely useful because, following White & Rees [292], the idea 
that galaxies form within such dark matter halos has gained increasing cre- 
dence. In this picture, the physical properties of galaxies are determined by 
the halos in which they form. Therefore, the statistical properties of a given 
galaxy population are determined by the properties of the parent halo popu- 
lation. There are now a number of detailed semianalytic models which imple- 
ment this approach [157,264,42,10]; they combine simple physically motivated 
galaxy formation recipes with the halo population output from a numerical 
simulation of the clustering of the dark matter distribution to make predic- 
tions about how the galaxy and dark matter distributions differ (see, e.g., 
Figure 2). 

In the White & Rees based models, different galaxy types populate different 
halos. Therefore, the halo based approach provides a simple and natural way of 
modelling the dependence of galaxy clustering on galaxy type in these models. 
It is also the natural way of modelling the difference between the clustering
of galaxies and that of the dark matter.

Just as the number of galaxies in a randomly placed cell may be a biased
tracer of the amount of dark matter in it, other physical properties of a cell,
such as the pressure, the velocity or the momentum, are also biased tracers of
the amount of dark matter the cell contains. The assumption that dark matter
halos are in virial equilibrium allows one to estimate these physical properties
for any given halo. If the distribution of halos in a randomly chosen cell is
known, then the halo-based approach allows one to estimate statistical
properties of, say, the pressure and the momentum, analogously to how it
transforms the statistics of the dark matter field into that for galaxies.

Fig. 2. Distribution of galaxies (in color) superposed on the dark matter distribution
(grey scale) in simulations run by the GIF collaboration [157]. Galaxy colors blue,
yellow, green and red represent successively smaller star formation rates. Different
panels show how the spatial distributions of dark matter and galaxies evolve; the
relation between the two distributions changes with time, as do the typical star
formation rates.

Data from large area imaging and redshift surveys of galaxies (e.g., the 2dF- 
GRS and the Sloan Digital Sky Survey) are now becoming available; these 
will provide constraints on the dark matter distribution on large scales, and 
on galaxy formation models on smaller scales. Weak gravitational lensing [146] 
provides a more direct probe of the dark matter density field. The first
generation of wide-field weak lensing surveys, which cover a few square degrees,
are now complete (see recent reviews by Bartelmann & Schneider [7] and
Mellier [186]), and the next generation of lensing surveys will cover several 
hundreds of square degrees. The Sunyaev-Zel'dovich (SZ) effect [274], due to 
the inverse-Compton scattering of cosmic microwave background (CMB) pho- 
tons off hot electrons in clusters, is a probe of the distribution of the pressure 
on large scales. Several wide-field surveys of the SZ are currently planned (see 
review by Birkinshaw [17]). In addition, the next generation CMB experiments 
will measure temperature fluctuations on small angular scales. On these small 
scales the density, velocity, momentum and pressure fields of the dark and/or 
baryonic matter leave their imprints on the CMB in a wide variety of ways. For 
example, in addition to the thermal and kinematic SZ effects, the small scale 
temperature fluctuations are expected to be weakly lensed. The halo model 
provides a single self-consistent framework for modelling and interpreting all
these observations. 

The purpose of this review is twofold. The first is to outline the principles
which underlie the halo approach. The second is to compare the predictions
of this approach with results from simulations and observations. Section 2
introduces background materials which are relevant for this review. Sections 3
to 5 present the halo approach to clustering. What we know about dark matter
halos is summarized in §3, how this information is incorporated into the
halo model is discussed in §4, and the first result, the halo model description 
of the dark matter density field is presented in §5. The galaxy distribution is 
discussed in §6, the velocity and momentum fields are studied in §7, weak grav- 
itational lensing in §8, and secondary effects on the cosmic microwave back- 
ground, including the thermal and kinetic Sunyaev-Zel'dovich effects [274,206], 
and the non-linear integrated Sachs-Wolfe effect [229,222], are the subject of
§9. 

We have chosen to discuss those aspects of the halo model which are relevant 
for the statistical studies of clustering, such as the two-point correlation func- 
tions and higher order statistics. We do not discuss what cosmological and 
astrophysical information can be deduced from the redshift distribution and 
evolution of halo number counts. The abundance of halos at high redshift is an 
important ingredient in models of reionization and the early universe. We do 
not discuss any of these models here, since they are described in the recent
review by Barkana & Loeb [8]. Finally, our description of the clustering of halos
relies on some results from perturbation theory which we do not derive in
detail here. For these, we refer the reader to the recent comprehensive review on
the perturbation theory description of gravitational clustering by Bernardeau
et al. [15]. 
2 Background Materials



This section describes the properties of adiabatic cold dark matter (CDM) 
models which are relevant to the present review. 

The expansion rate for adiabatic CDM cosmological models with a cosmological
constant is

\begin{equation}
H^2(z) = H_0^2 \left[ \Omega_m (1+z)^3 + \Omega_K (1+z)^2 + \Omega_\Lambda \right] , \tag{1}
\end{equation}

where $H_0$ can be written as the inverse Hubble distance today,
$cH_0^{-1} = 2997.9\,h^{-1}$ Mpc. The critical density is
$\rho_{\rm crit} = 3H^2/8\pi G$. The total density is a sum over different
components $i$, where $i = c$ for the cold dark matter, $\Lambda$ for the
cosmological constant, and $b$ for baryons. Our convention will be to denote the
contribution to the total density from component $i$ as
$\Omega_i = \rho_i/\rho_{\rm crit}$. The contribution of spatial curvature to the
expansion rate is $\Omega_K = 1 - \sum_i \Omega_i$, and the matter density is
$\Omega_m = \Omega_c + \Omega_b$.

Convenient measures of distance and time include the conformal distance (or
lookback time) from the observer at redshift $z = 0$,

\begin{equation}
r(z) = \int_0^z \frac{dz'}{H(z')} \tag{2}
\end{equation}

(we have set $c = 1$), and the angular diameter distance

\begin{equation}
d_A = H_0^{-1}\, \Omega_K^{-1/2} \sinh\left( H_0\, \Omega_K^{1/2}\, r \right) . \tag{3}
\end{equation}

Note that as $\Omega_K \to 0$, $d_A \to r$, and we define $r(z = \infty) = r_0$.

2.1 Statistical description of random fields

The dark matter density field in the adiabatic CDM model possesses $n$-point
correlation functions, defined in the usual way. That is, in real coordinate
space, the $n$-point correlation function $\xi_n$ of density fluctuations
$\delta(\mathbf{x})$ is defined by

\begin{equation}
\langle \delta(\mathbf{x}_1) \cdots \delta(\mathbf{x}_n) \rangle_c \equiv \xi_n(\mathbf{x}_1, \ldots, \mathbf{x}_n) . \tag{4}
\end{equation}
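Here, we have expressed the density perturbations in the universe as
fluctuations relative to the background mean density, $\bar\rho$:

\begin{equation}
\delta(\mathbf{x}) = \frac{\rho(\mathbf{x})}{\bar\rho} - 1 . \tag{5}
\end{equation}

If all the $\mathbf{x}_i$ are the same, then

\begin{align}
\langle\delta\rangle_c &= \langle\delta\rangle \tag{6} \\
\langle\delta^2\rangle_c &= \langle\delta^2\rangle - \langle\delta\rangle_c^2 \equiv \sigma^2 \tag{7} \\
\langle\delta^3\rangle_c &= \langle\delta^3\rangle - 3\langle\delta^2\rangle_c \langle\delta\rangle_c - \langle\delta\rangle_c^3 \tag{8} \\
\langle\delta^4\rangle_c &= \langle\delta^4\rangle - 4\langle\delta^3\rangle_c \langle\delta\rangle_c - 3\langle\delta^2\rangle_c^2 - 6\langle\delta^2\rangle_c \langle\delta\rangle_c^2 - \langle\delta\rangle_c^4 . \tag{9}
\end{align}

We will almost always consider the case in which $\langle\delta\rangle = 0$.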
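
As a quick numerical check of equations (6)-(9), the following minimal sketch converts raw one-point moments into connected moments. The function name `connected_moments` and the Bernoulli test input are illustrative assumptions, chosen only because the expected cumulants are known in closed form.

```python
def connected_moments(m1, m2, m3, m4):
    """Connected moments <delta^n>_c for n = 1..4 from the raw moments
    <delta^n>, following equations (6)-(9)."""
    c1 = m1
    c2 = m2 - c1**2
    c3 = m3 - 3.0 * c2 * c1 - c1**3
    c4 = m4 - 4.0 * c3 * c1 - 3.0 * c2**2 - 6.0 * c2 * c1**2 - c1**4
    return c1, c2, c3, c4


# Illustrative input (not from the text): a Bernoulli variable with
# probability p has raw moments <x^n> = p for every n >= 1.
p = 0.3
c1, c2, c3, c4 = connected_moments(p, p, p, p)
```

When the mean vanishes, as assumed in most of what follows, the second and third connected moments reduce to the raw moments, while the fourth still differs from the raw moment by the Gaussian term $3\langle\delta^2\rangle^2$.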

Many of the calculations to follow simplify considerably in Fourier space.
Throughout, we will use the following Fourier space conventions:

\begin{equation}
A(\mathbf{x}) = \int \frac{d^3k}{(2\pi)^3}\, A(\mathbf{k}) \exp(i\mathbf{k}\cdot\mathbf{x})
\quad {\rm and} \quad
\delta_D(\mathbf{k}_{1\ldots i}) = \int \frac{d^3x}{(2\pi)^3} \exp[-i\mathbf{x}\cdot(\mathbf{k}_1 + \cdots + \mathbf{k}_i)] \tag{10}
\end{equation}

for the Dirac delta function, which is not to be confused with the density
perturbation, which does not have the subscript $D$.

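Thus, the real space fluctuations in the density field are a sum over Fourier
modes:

\begin{equation}
\delta(\mathbf{x}) = \int \frac{d^3k}{(2\pi)^3}\, \delta(\mathbf{k}) \exp(i\mathbf{k}\cdot\mathbf{x}) , \tag{11}
\end{equation}

and the two, three and four-point Fourier-space correlations are

\begin{align}
\langle \delta(\mathbf{k}_1)\, \delta(\mathbf{k}_2) \rangle &= (2\pi)^3 \delta_D(\mathbf{k}_{12})\, P(k_1) , \tag{12} \\
\langle \delta(\mathbf{k}_1)\, \delta(\mathbf{k}_2)\, \delta(\mathbf{k}_3) \rangle &= (2\pi)^3 \delta_D(\mathbf{k}_{123})\, B(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3) , \tag{13} \\
\langle \delta(\mathbf{k}_1) \ldots \delta(\mathbf{k}_4) \rangle_c &= (2\pi)^3 \delta_D(\mathbf{k}_{1234})\, T(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3, \mathbf{k}_4) , \tag{14}
\end{align}

where $\mathbf{k}_{i\ldots j} = \mathbf{k}_i + \ldots + \mathbf{k}_j$. The quantities
$P$, $B$ and $T$ are known as the power spectrum, bispectrum and trispectrum,
respectively. Notice that

\begin{equation}
\xi_2(r) = \int \frac{d^3k}{(2\pi)^3}\, P(k) \exp(i\mathbf{k}\cdot\mathbf{r}) ; \tag{15}
\end{equation}

that is, the two-point correlation function and the power spectrum are Fourier
transform pairs. Similarly, we can relate higher order correlations and their
Fourier space analogues.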

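Rather than working with $P(k)$ itself, it is often more convenient to use the
dimensionless quantity

\begin{equation}
\Delta^2(k) \equiv \frac{k^3 P(k)}{2\pi^2} , \tag{16}
\end{equation}

which is the power per logarithmic interval in wavenumber. Similarly, we can
define a scaled dimensionless quantity for the $N$th Fourier space correlation
such that it scales roughly as the logarithmic power spectrum defined above:

\begin{equation}
\Delta^2_N(\mathbf{k}_1, \ldots, \mathbf{k}_N) \equiv \frac{k^3}{2\pi^2} \left[ P_N(\mathbf{k}_1, \ldots, \mathbf{k}_N) \right]^{1/(N-1)} . \tag{17}
\end{equation}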

We will also often use the quantity

\begin{equation}
\sigma^2(R) \equiv \int \frac{dk}{k}\, \frac{k^3 P(k)}{2\pi^2}\, |W(kR)|^2 ; \tag{18}
\end{equation}

this is the variance in the smoothed density field when the smoothing window
has scale $R$. If the window is a tophat in real space, then
$W(kR) = [3/(kR)^3](\sin kR - kR \cos kR)$; it is $\exp(-k^2R^2/2)$ if the real
space smoothing window is a Gaussian: $\exp[-(r/R)^2/2]/\sqrt{2\pi R^2}$.
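
The smoothed variance of equation (18) is straightforward to evaluate numerically. The sketch below is a minimal illustration, not the procedure used in this review: the function names, the integration limits, the midpoint rule in $\ln k$, and the toy power-law spectrum are all assumptions made for the example.

```python
import math


def tophat_window(x):
    """Fourier transform of a real-space tophat, W(x) with x = kR."""
    if x < 1e-3:
        return 1.0 - x * x / 10.0  # leading series terms avoid cancellation
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x**3


def sigma2(R, delta2, kmin=1e-4, kmax=1e2, nsteps=20000):
    """Variance of the smoothed field, equation (18), by the midpoint rule
    in ln k. delta2(k) is the dimensionless power of equation (16)."""
    lnk_min, lnk_max = math.log(kmin), math.log(kmax)
    dlnk = (lnk_max - lnk_min) / nsteps
    total = 0.0
    for i in range(nsteps):
        k = math.exp(lnk_min + (i + 0.5) * dlnk)
        total += delta2(k) * tophat_window(k * R) ** 2
    return total * dlnk


# Toy spectrum (illustrative assumption): P(k) ~ k^n with n = -2, so that
# Delta^2(k) ~ k and sigma^2(R) should scale as R^-(n+3) = 1/R.
ratio = sigma2(2.0, lambda k: k) / sigma2(1.0, lambda k: k)
```

For a realistic spectrum one would pass a `delta2` built from a normalized linear power spectrum, and the integration limits would need to bracket the scales that contribute at the chosen $R$.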

The initial perturbations due to inflation are expected to be Gaussian [268,109,98,5,160], 
so they can be characterized by a power spectrum or a two point correlation 
function (Wick's theorem states that, for a Gaussian field, correlations in- 
volving an odd number of density fluctuations are exactly zero). Thus, the 
bispectrum and trispectrum are defined so that, for a Gaussian field, they are 
identically zero; in the jargon, this means that only the connected piece is 
used to define them, hence the subscript c in equation (14). All
these quantities evolve. Here and throughout, we do not explicitly write the 
redshift dependence when we believe no confusion will arise. 

2.2 Results from perturbation theory 

The large scale structure we see today is thought to be due to the gravita- 
tional evolution of initially Gaussian fluctuations [217,18,64]. In an expanding 
universe filled with CDM particles, the action of gravity results in the
generation of higher order correlations: the initially Gaussian distribution becomes
non-Gaussian. The perturbation theory description of the gravitational
evolution of density perturbations is well developed [15]. Here we briefly summarize
some of the results which are most relevant for what is to follow. 

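The evolution of large scale structure density perturbations is governed by
the continuity equation,

\begin{equation}
\frac{\partial\delta}{\partial t} + \frac{1}{a} \nabla\cdot(1+\delta)\mathbf{u} = 0 , \tag{19}
\end{equation}

and the Euler equation,

\begin{equation}
\frac{\partial\mathbf{u}}{\partial t} + H\mathbf{u} + \frac{1}{a}\left[ (\mathbf{u}\cdot\nabla)\mathbf{u} + \nabla\phi \right] = 0 , \tag{20}
\end{equation}

where the potential fluctuations due to density perturbations are related by
the Poisson equation:

\begin{equation}
\nabla^2\phi = 4\pi G \bar\rho\, a^2 \delta , \tag{21}
\end{equation}

while the peculiar velocity is related to the Hubble flow via

\begin{equation}
\mathbf{u} = \mathbf{v} - H\mathbf{x} . \tag{22}
\end{equation}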



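The linear regime is the one in which $\delta \ll 1$. In the linear regime, the
continuity and Euler equations may be combined to yield [216]

\begin{equation}
\frac{\partial^2\delta}{\partial t^2} + 2H \frac{\partial\delta}{\partial t} - 4\pi G \bar\rho\, \delta = 0 . \tag{23}
\end{equation}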



This is a second-order differential equation with two independent solutions;
these correspond to modes which grow and decay with time. For our purposes,
only the growing mode solution of equation (23) is relevant. This has the form
[216]

\begin{equation}
\delta(k, r) = G(r)\, \delta(k, 0) , \tag{24}
\end{equation}

where

\begin{align}
G(r) &\propto \frac{H(r)}{H_0} \int_{z(r)}^{\infty} dz'\, (1+z') \left[ \frac{H_0}{H(z')} \right]^3 \nonumber \\
&\approx \frac{5}{2}\, \frac{\Omega_m(z)/(1+z)}{\Omega_m(z)^{4/7} - \Omega_\Lambda(z) + \left[ 1 + \Omega_m(z)/2 \right]\left[ 1 + \Omega_\Lambda(z)/70 \right]} . \tag{25}
\end{align}

This shows that the linear theory density field may be scaled in time, or
redshift, with the use of the growth solution $G(z)$. Note that
$G \propto a = (1+z)^{-1}$ as $\Omega_m \to 1$. The approximation in the second
line of equation (25) is good to a few percent [164,167,34].
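
The two lines of equation (25) can be compared directly. The sketch below is illustrative only: the function names are hypothetical, the integral is rewritten in terms of the scale factor $a = 1/(1+z)$ for numerical convenience, and the overall normalization (chosen so that $G = 1/(1+z)$ when $\Omega_m = 1$) is an assumption, since equation (25) fixes $G$ only up to a constant.

```python
import math


def growth_integral(z, omega_m, omega_l, nsteps=4000):
    """First line of equation (25), normalised so that G = 1/(1+z) for
    omega_m = 1 and omega_l = 0. Uses the midpoint rule in a = 1/(1+z)."""
    omega_k = 1.0 - omega_m - omega_l
    a_end = 1.0 / (1.0 + z)
    da = a_end / nsteps
    integral = 0.0
    for i in range(nsteps):
        a = (i + 0.5) * da
        # (1+z') dz' [H0/H]^3 = da / (a E)^3, with (a E)^2 = Om/a + Ok + Ol a^2
        integral += da / (omega_m / a + omega_k + omega_l * a * a) ** 1.5
    e_z = math.sqrt(omega_m * (1 + z) ** 3 + omega_k * (1 + z) ** 2 + omega_l)
    return 2.5 * omega_m * e_z * integral


def growth_approx(z, omega_m, omega_l):
    """Second line of equation (25), with density parameters evaluated at z."""
    omega_k = 1.0 - omega_m - omega_l
    e2 = omega_m * (1 + z) ** 3 + omega_k * (1 + z) ** 2 + omega_l
    om_z = omega_m * (1 + z) ** 3 / e2
    ol_z = omega_l / e2
    g = 2.5 * om_z / (om_z ** (4.0 / 7.0) - ol_z
                      + (1.0 + om_z / 2.0) * (1.0 + ol_z / 70.0))
    return g / (1.0 + z)
```

Calling both functions with, say, `omega_m=0.35` and `omega_l=0.65` gives a direct feel for the quoted few-percent accuracy of the approximation.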

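In linear perturbation theory, the power spectrum of the initial density
fluctuation field is

\begin{equation}
\frac{k^3 P^{\rm lin}(k)}{2\pi^2} = \delta_H^2 \left( \frac{k}{H_0} \right)^{n+3} T^2(k) . \tag{26}
\end{equation}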



Here, $n$ is said to be the slope of the initial spectrum. A scale free form
for $P(k) \propto k^n$ is rather generic; models of inflation generally produce
$n \approx 1$ (the so-called Harrison-Zel'dovich spectrum [108,299,214]). The quantity
T(k), defined such that T(0) = 1, describes departures from the initially 
scale free form. Departures are expected because the energy density of the 
Universe is dominated by radiation at early times but by matter at late times, 
and the growth rate of perturbations in the radiation dominated era differs 
from that in the matter dominated era. The transition from one to the other 
produces a turnover in the shape of the power spectrum [20,18]. Baryons and 
other species, such as massive neutrinos, leave other important features in 
the transfer function which can potentially be extracted from observational 
data. Accurate fitting functions for T(k) which include these effects have been 
available for some time [6,117,73]. When illustrating calculations presented in 
this review, we use fits to the transfer functions given by [74]. 

When written in comoving coordinates, the continuity equation (19) shows
that the Fourier transforms of the linear theory (i.e., when $\delta \ll 1$)
density and velocity fields are related [216]:

\begin{equation}
\mathbf{u}(\mathbf{k}) = -i \dot{G}\, \delta(\mathbf{k})\, \frac{\mathbf{k}}{k^2} , \tag{27}
\end{equation}

where the derivative is with respect to the radial distance $r(z)$ defined in
equation (2). This shows that the power spectrum of linear theory velocities
is

\begin{equation}
P^{\rm lin}_{vv}(k) = \frac{\dot{G}^2}{k^2}\, P^{\rm lin}(k) . \tag{28}
\end{equation}



It should be understood that "lin" denotes here the lowest non-vanishing order
of perturbation theory for the object in question. For the power spectrum, this is
linear perturbation theory; for the bispectrum, this is second order perturbation
theory, etc.
The fluctuations in the linear density field are also simply related to those in
the potential [4]. In particular, the Fourier transform of the Poisson equation
(21) shows that

\begin{equation}
\phi(\mathbf{k}) = \frac{3}{2} \frac{\Omega_m}{a} \left( \frac{H_0}{k} \right)^2 \left( 1 + 3 \frac{H_0^2}{k^2} \Omega_K \right)^{-2} \delta(\mathbf{k}) . \tag{29}
\end{equation}



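Since gravity induces higher order correlations in the density field,
perturbation theory can be used to calculate them also. The bispectrum, i.e., the
Fourier transform of the three point correlation function of density
perturbations, can be calculated using second order perturbation theory [93,178,132]:

\begin{equation}
B^{\rm lin}(\mathbf{k}_p, \mathbf{k}_q, \mathbf{k}_r) = 2 F_2^s(\mathbf{k}_p, \mathbf{k}_q)\, P(k_p) P(k_q) + 2\ {\rm Perm.} , \tag{30}
\end{equation}

where

\begin{equation}
F_2^s(\mathbf{q}_1, \mathbf{q}_2) = \frac{1}{2} \left[ (1+\mu) + \frac{\mathbf{q}_1 \cdot \mathbf{q}_2}{q_1 q_2} \left( \frac{q_1}{q_2} + \frac{q_2}{q_1} \right) + (1-\mu) \frac{(\mathbf{q}_1 \cdot \mathbf{q}_2)^2}{q_1^2 q_2^2} \right] . \tag{31}
\end{equation}

The bispectrum depends only weakly on $\Omega_m$, the only dependence coming
from the fact that $\mu \approx (3/7)\,\Omega_m^{-2/63}$ for $0.05 < \Omega_m < 3$ [153].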

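The expressions above show that, in perturbation theory, the bispectrum
generally scales as the square of the power spectrum. Therefore, it is conventional
to define a reduced bispectrum:

\begin{equation}
Q_{123} \equiv \frac{B_{123}}{P_1 P_2 + P_2 P_3 + P_3 P_1} . \tag{32}
\end{equation}

To lowest order in perturbation theory, $Q$ is independent of time and scale
[85,86]. When the $\mathbf{k}$ vectors make an equilateral triangle configuration, then

\begin{equation}
Q_{\rm eq}(k) \equiv \frac{1}{3} \left[ \frac{\Delta^2_{\rm eq}(k)}{\Delta^2(k)} \right]^2 ,
\qquad {\rm where} \qquad
\Delta^2_{\rm eq}(k) \equiv \frac{k^3}{2\pi^2} \sqrt{B(k, k, k)} \tag{33}
\end{equation}

represents the bispectrum for equilateral triangle configurations. In second
order perturbation theory,

\begin{equation}
Q^{\rm PT}_{\rm eq} = 1 - \frac{3}{7}\, \Omega_m^{-2/63} ; \tag{34}
\end{equation}

this should be a good approximation on large scales.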
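
Equations (30)-(32) are simple enough to evaluate directly. The sketch below is an illustration under stated assumptions: the function names are hypothetical, triangles are specified by their three side lengths (with the angle between two sides fixed by the closure condition $\mathbf{k}_1 + \mathbf{k}_2 + \mathbf{k}_3 = 0$), the default $\mu = 3/7$ corresponds to $\Omega_m = 1$, and the power-law spectrum is a toy input.

```python
def f2_sym(q1, q2, cos12, mu=3.0 / 7.0):
    """Symmetrised second-order kernel of equation (31).

    q1 and q2 are wavevector magnitudes, cos12 is the cosine of the angle
    between them, and mu = 3/7 corresponds to Omega_m = 1.
    """
    return 0.5 * ((1.0 + mu)
                  + cos12 * (q1 / q2 + q2 / q1)
                  + (1.0 - mu) * cos12 ** 2)


def bispectrum_tree(k1, k2, k3, power, mu=3.0 / 7.0):
    """Equation (30) for a closed triangle with side lengths k1, k2, k3."""
    def cos_between(a, b, c):
        # k_a + k_b + k_c = 0 implies c^2 = a^2 + b^2 + 2 a b cos(theta_ab)
        return (c * c - a * a - b * b) / (2.0 * a * b)

    total = 0.0
    for a, b, c in ((k1, k2, k3), (k2, k3, k1), (k3, k1, k2)):
        total += 2.0 * f2_sym(a, b, cos_between(a, b, c), mu) * power(a) * power(b)
    return total


def reduced_bispectrum(k1, k2, k3, power, mu=3.0 / 7.0):
    """Equation (32): Q = B / (P1 P2 + P2 P3 + P3 P1)."""
    p1, p2, p3 = power(k1), power(k2), power(k3)
    return bispectrum_tree(k1, k2, k3, power, mu) / (p1 * p2 + p2 * p3 + p3 * p1)


# Toy power law P(k) ~ k^-1.5 (illustrative only); for equilateral triangles
# the reduced bispectrum does not depend on the shape of P(k).
q_eq = reduced_bispectrum(0.1, 0.1, 0.1, lambda k: k ** -1.5)
```

For $\mu = 3/7$ the equilateral value is $4/7$, in agreement with equation (34) evaluated at $\Omega_m = 1$.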
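Similarly, the perturbation theory trispectrum is

\begin{align}
T^{\rm lin} = {} & 4 \left[ F_2^s(\mathbf{k}_{12}, -\mathbf{k}_1)\, F_2^s(\mathbf{k}_{12}, \mathbf{k}_3)\, P(k_1) P(k_{12}) P(k_3) + {\rm Perm.} \right] \nonumber \\
& + 6 \left[ F_3^s(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3)\, P(k_1) P(k_2) P(k_3) + {\rm Perm.} \right] ; \tag{35}
\end{align}

there are 12 permutations in the first set and 4 in the second [86]. The function
$F_3^s$ can be derived through a recursion relation [93,132,178]

\begin{align}
F_n^s(\mathbf{q}_1, \ldots, \mathbf{q}_n) = \sum_{m=1}^{n-1} \frac{G_m^s(\mathbf{q}_1, \ldots, \mathbf{q}_m)}{(n-1)(2n+3)} \Bigg[ & (2n+1)\, \frac{\mathbf{q}_{1,n} \cdot \mathbf{q}_{1,m}}{\mathbf{q}_{1,m} \cdot \mathbf{q}_{1,m}}\, F_{n-m}^s(\mathbf{q}_{m+1}, \ldots, \mathbf{q}_n) \nonumber \\
& + \frac{(\mathbf{q}_{1,n} \cdot \mathbf{q}_{1,n})(\mathbf{q}_{1,m} \cdot \mathbf{q}_{m+1,n})}{(\mathbf{q}_{1,m} \cdot \mathbf{q}_{1,m})(\mathbf{q}_{m+1,n} \cdot \mathbf{q}_{m+1,n})}\, G_{n-m}^s(\mathbf{q}_{m+1}, \ldots, \mathbf{q}_n) \Bigg] \tag{36}
\end{align}

with $\mathbf{q}_{a,b} = \mathbf{q}_a + \ldots + \mathbf{q}_b$, $F_1^s = G_1^s = 1$ and

\begin{equation}
G_2^s(\mathbf{q}_1, \mathbf{q}_2) = \mu + \frac{1}{2} \frac{\mathbf{q}_1 \cdot \mathbf{q}_2}{q_1 q_2} \left( \frac{q_1}{q_2} + \frac{q_2}{q_1} \right) + (1-\mu) \frac{(\mathbf{q}_1 \cdot \mathbf{q}_2)^2}{q_1^2 q_2^2} , \tag{37}
\end{equation}

where $\mu$ has the same dependence on $\Omega_m$ as in $F_2^s$ (equation 31). The
factor of 2 in equation (30) and the factors of 4 and 6 in equation (35) are due
to the use of symmetric forms of the $F_n^s$. Once again, it is useful to define

\begin{equation}
Q_{1234} \equiv \frac{T_{1234}}{\left[ P_1 P_2 P_{13} + {\rm cyc.} \right] + \left[ P_1 P_2 P_3 + {\rm cyc.} \right]} , \tag{38}
\end{equation}

where the permutations include 8 and 4 terms respectively in the ordering of
$(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3, \mathbf{k}_4)$. For a square configuration,

\begin{equation}
Q_{\rm sq}(k) \equiv \frac{T(\mathbf{k}, -\mathbf{k}, \mathbf{k}_\perp, -\mathbf{k}_\perp)}{\left[ 8 P^2(k) P(\sqrt{2} k) \right] + \left[ 4 P^3(k) \right]} . \tag{39}
\end{equation}

In perturbation theory, $Q_{\rm sq} \approx 0.085$.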

The perturbation theory description of clustering also makes predictions for
correlations in real space. For clustering from Gaussian initial conditions, the
higher order moments of the dark matter distribution in real space satisfy

\begin{equation}
\langle\delta^n\rangle_c = S_n\, \langle\delta^2\rangle^{n-1} , \qquad {\rm if}\ \langle\delta^2\rangle \ll 1 , \tag{40}
\end{equation}

where the $S_n$ are numerical coefficients which are approximately independent
of scale over a range of large scales on which $\langle\delta^2\rangle \ll 1$.
These coefficients are [12]

\begin{align}
S_3^{\rm lin} &= \frac{34}{7} + \gamma_1 , \qquad
S_4^{\rm lin} = \frac{60712}{1323} + \frac{62}{3}\gamma_1 + \frac{7}{3}\gamma_1^2 + \frac{2}{3}\gamma_2 , \quad {\rm and} \nonumber \\
S_5^{\rm lin} &= \frac{200575880}{305613} + \frac{1847200}{3969}\gamma_1 + \frac{6940}{63}\gamma_1^2 + \frac{235}{27}\gamma_1^3 + \frac{1490}{63}\gamma_2 + \frac{50}{9}\gamma_1\gamma_2 + \frac{10}{27}\gamma_3 , \tag{41}
\end{align}

where

\begin{equation}
\gamma_j \equiv \frac{d^j \ln \sigma^2(R)}{d(\ln R)^j} , \tag{42}
\end{equation}

and $\sigma^2(R)$ is defined by inserting the linear theory value of
$P^{\rm lin}(k)$ in equation (18). Note that $\gamma_1 = -(n+3)$ and
$\gamma_i = 0$ for $i > 1$ if the initial spectrum is a power-law with slope $n$.
For the CDM family of spectra, one can neglect derivatives of $\sigma^2(R)$ with
respect to scale for $R < 20\,h^{-1}$ Mpc. Also note that the $S_n^{\rm lin}$
depend only slightly on cosmology: e.g., the skewness is
$S_3^{\rm lin} = 4 + \frac{6}{7}\Omega_m^{-2/63} + \gamma_1$ [29,114,84].

Although all derivations to follow will be general, we will often illustrate our
results with the currently favored $\Lambda$CDM cosmological model. Following the
definition of the expansion rate (equation 1) and the power spectrum of the linear
density field (equation 26), the relevant parameters for this model are
$\Omega_c = 0.30$, $\Omega_b = 0.05$, $\Omega_\Lambda = 0.65$, $h = 0.65$, and $n = 1$.
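
For orientation, the background quantities of equations (1)-(3) can be evaluated for this fiducial model with a few lines of code. This is a sketch rather than the implementation used in the review: the function names and the choice of Simpson's rule are assumptions, distances are returned in units of $c/H_0$, and the use of $\sin$ in place of $\sinh$ for a closed model is the standard analytic continuation of equation (3), not something spelled out in the text.

```python
import math

HUBBLE_DISTANCE = 2997.9  # c / H_0 in h^-1 Mpc, as quoted after equation (1)


def e_of_z(z, omega_m, omega_l):
    """H(z) / H_0 from equation (1)."""
    omega_k = 1.0 - omega_m - omega_l
    return math.sqrt(omega_m * (1 + z) ** 3 + omega_k * (1 + z) ** 2 + omega_l)


def conformal_distance(z, omega_m, omega_l, nsteps=2000):
    """r(z) of equation (2) in units of c / H_0, using Simpson's rule."""
    if nsteps % 2:
        nsteps += 1
    step = z / nsteps
    total = 1.0 / e_of_z(0.0, omega_m, omega_l) + 1.0 / e_of_z(z, omega_m, omega_l)
    for i in range(1, nsteps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / e_of_z(i * step, omega_m, omega_l)
    return total * step / 3.0


def angular_diameter_distance(z, omega_m, omega_l):
    """d_A of equation (3) in units of c / H_0."""
    omega_k = 1.0 - omega_m - omega_l
    r = conformal_distance(z, omega_m, omega_l)
    if abs(omega_k) < 1e-12:
        return r  # flat limit: d_A -> r
    s = math.sqrt(abs(omega_k))
    # sinh for an open model; sin is the continuation to a closed model
    return math.sinh(s * r) / s if omega_k > 0 else math.sin(s * r) / s


# Fiducial model of the text: Omega_m = Omega_c + Omega_b = 0.35,
# Omega_Lambda = 0.65, h = 0.65. Distance to z = 1 in Mpc:
h = 0.65
r_mpc = conformal_distance(1.0, 0.35, 0.65) * HUBBLE_DISTANCE / h
```

Because the fiducial model is spatially flat, `angular_diameter_distance` simply returns the conformal distance in this case.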

The associated power spectrum of linear fluctuations is normalized to match
the observed anisotropy in the cosmic microwave background at the largest
scales, i.e., those measured by the COBE mission. This means [32] that we
set $\delta_H = 4.2 \times 10^{-5}$. A constraint on the shape of the spectrum
also follows from specifying the amplitude of the power spectrum on a smaller
scale. This additional constraint is usually phrased as requiring that
$\sigma_8$, the rms value of the linear density fluctuation field when it is
smoothed with a tophat filter of scale $R = 8\,h^{-1}$ Mpc (i.e., $\sigma_8$ is
calculated using equation 18 with $R = 8\,h^{-1}$ Mpc), has a specified value.
This value is set by requiring that the resulting fluctuations are able to
produce the observed abundance of galaxy clusters. Uncertainties in the
conversion of X-ray flux, or temperature, to cluster mass, yield values for
$\sigma_8$ which are in the range $(0.5$-$0.6)\,\Omega_m^{-(0.5\text{-}0.6)}$
[77,284]. Another constraint on the value of $\sigma_8$ is that, when properly
evolved into the past, the same density power spectrum should also match the
associated fluctuations in the CMB; the two constraints are generally in better
agreement in a cosmology with a cosmological constant than in an open universe.
The use of a realistic value for $\sigma_8$ is important because higher order
correlations typically depend non-linearly on the amplitude of the initial
linear density field.
2.3 Beyond perturbation theory

The following ansatz, due to [102], provides a good description of the
two-point correlations in real space, $\xi_2(r)$, and in Fourier space, $P(k)$,
even in the regime where perturbation theory becomes inaccurate. The argument is
that non-linear gravitational evolution rescales all lengths, so pairs initially
separated by scale $r_L$ will later be separated by a different scale, say
$r_{NL}$. The initial and final scales are related by

\begin{equation}
r_L = \left[ 1 + \bar\xi_{NL}(r_{NL}) \right]^{1/3} r_{NL} ,
\qquad {\rm where} \qquad
\bar\xi(x) = \frac{3}{x^3} \int_0^x dr\, r^2\, \xi_2(r) . \tag{43}
\end{equation}

The ansatz is that there exists some universal function

\begin{equation}
\bar\xi_{NL}(r_{NL}) = X_{NL}\left[ \bar\xi_L(r_L) \right] . \tag{44}
\end{equation}

This is motivated by noting that $\bar\xi_{NL}(r_{NL}) \propto \bar\xi_L(r_L)$ in
the linear regime where $\bar\xi_{NL} \ll 1$, and
$\bar\xi_{NL}(r_{NL}) \propto [\bar\xi_L(r_L)]^{3/2}$ in the highly non-linear
regime where $\bar\xi \gg 1$. The 3/2 scaling comes if $\bar\xi_{NL} \propto a^3$
in the highly non-linear regime (this is expected if, on the smallest scales, the
expansion of the background is irrelevant), whereas $\bar\xi_L \propto a^2$. In
the intermediate regime, it has been argued that
$\bar\xi_{NL}(r_{NL}) \propto [\bar\xi_L(r_L)]^3$ [202]. The exact transitions
between the different regimes, however, must be calibrated using numerical
simulations.

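For similar reasons, one expects a scaling for the non-linear power spectrum

\begin{equation}
k_L = \frac{k_{NL}}{\left[ 1 + \Delta^2_{NL}(k_{NL}) \right]^{1/3}}
\qquad {\rm with} \qquad
\Delta^2_{NL}(k_{NL}) = f_{NL}\left[ \Delta^2_L(k_L) \right] . \tag{45}
\end{equation}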



Fitting formulae for $X_{NL}$ and $f_{NL}$, obtained by calibrating to numerical
simulations, are given in [209,131]; in what follows we will use the fits given by
[210] for the power spectrum with

\begin{equation}
f_{NL}(x) = x \left[ \frac{1 + B\beta x + (Ax)^{\alpha\beta}}{1 + \left[ (Ax)^\alpha g^3 / (V x^{1/2}) \right]^\beta} \right]^{1/\beta} , \tag{46}
\end{equation}

where

\begin{align}
A &= 0.482 \left( 1 + \frac{n}{3} \right)^{-0.947} , &
B &= 0.226 \left( 1 + \frac{n}{3} \right)^{-1.778} , \nonumber \\
\alpha &= 3.310 \left( 1 + \frac{n}{3} \right)^{-0.244} , &
\beta &= 0.862 \left( 1 + \frac{n}{3} \right)^{-0.287} , \nonumber \\
V &= 11.55 \left( 1 + \frac{n}{3} \right)^{-0.423} , \tag{47}
\end{align}

and

\begin{equation}
n(k_L) = \left. \frac{d \ln P}{d \ln k} \right|_{k = k_L/2} . \tag{48}
\end{equation}

Note that in equation (46), the redshift dependence comes only from the factor
of $g^3$, where $g$ is the growth suppression factor relative to an
$\Omega_m = 1$ universe: $g = (1+z)\,G(z)$, with $G(z)$ the linear growth factor
given in equation (25). The parameters of the above fit come from a handful of
simulations and are valid for a limited number of cosmological models. Fits to
the non-linear power spectrum in some cosmological models containing dark energy
are provided by [171].
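
The mapping of equations (45)-(47) translates directly into code. The sketch below is illustrative: the function names are hypothetical, and the caller is assumed to supply the linear dimensionless power `x`, the effective index `n` of equation (48) (with `n > -3`), and the growth suppression factor `g`.

```python
def pd96_fnl(x, n, g=1.0):
    """Fitting function f_NL(x) of equations (46)-(47).

    x is the linear dimensionless power Delta^2_L(k_L), n the effective
    spectral index of equation (48), and g the growth suppression factor
    (g = 1 for Omega_m = 1).
    """
    s = 1.0 + n / 3.0
    A = 0.482 * s ** -0.947
    B = 0.226 * s ** -1.778
    alpha = 3.310 * s ** -0.244
    beta = 0.862 * s ** -0.287
    V = 11.55 * s ** -0.423
    numerator = 1.0 + B * beta * x + (A * x) ** (alpha * beta)
    denominator = 1.0 + ((A * x) ** alpha * g ** 3 / (V * x ** 0.5)) ** beta
    return x * (numerator / denominator) ** (1.0 / beta)


def nonlinear_wavenumber(k_lin, delta2_nl):
    """Equation (45) inverted: k_NL = k_L [1 + Delta^2_NL(k_NL)]^(1/3)."""
    return k_lin * (1.0 + delta2_nl) ** (1.0 / 3.0)
```

In use, one would evaluate `pd96_fnl` on a grid of linear wavenumbers and then shift each point to `nonlinear_wavenumber(k_lin, delta2_nl)`. The function reduces to $f_{NL} \approx x$ for small $x$ and approaches $V x^{3/2}/g^3$ for large $x$, matching the linear and stable-clustering limits described above.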

Although this ansatz, and the associated fitting function represents a signifi- 
cant step beyond perturbation theory, there have been no successful extensions 
of it to higher order clustering statistics. In addition, it is not obvious how to 
extend it to describe fields other than the density of dark matter. 

Hyper-extended perturbation theory (HEPT; [238]) represents a reasonably
successful attempt to extract what is known from perturbation theory and
apply it in the highly nonlinear regime. This model makes specific predictions
about higher order clustering. For example,

\begin{equation}
Q^{\rm sat}_{\rm sq}(k) = \frac{1}{2} \left[ \frac{54 - 27 \cdot 2^n + 2 \cdot 3^n + 6^n}{1 + 6 \cdot 2^n + 3 \cdot 3^n + 6 \cdot 6^n} \right] , \tag{49}
\end{equation}

where $n(k)$ is the linear power spectral index at $k$. Fitting functions for
$Q_{\rm eq}(k)$ for $0.1 < k < 3\,h$ Mpc$^{-1}$, calibrated from numerical
simulations, can be found in [239].



3 Dark Matter Halo Properties 



In this section, and the next, we will describe an approach which allows one 
to describe all n-point correlations of large scale structure. This description
can be used to study clustering of a variety of physical quantities, includ- 
ing the dark matter density field, the galaxy distribution, the pressure, the 
momentum, and others. 

The approach assumes that all the mass in the Universe is partitioned up into 
distinct units, which we will often call halos. If distinct halos can be identified, 
then it is likely that they are small compared to the typical distances between 
them. This then suggests that the statistics of the mass density field on small
scales are determined by the spatial distribution within the halos; the precise 
way in which the halos themselves may be organized into large scale structures 
is not important. On the other hand, the details of the internal structure of 
the halos cannot be important on scales larger than a typical halo; on large 
scales, the important ingredient is the spatial distribution of the halos. This 
realization, that the distribution of the mass can be studied in two steps: the 
distribution of mass within each halo, and the spatial distribution of the halos 
themselves, is the key to what has come to be called the halo model. 

The halo model assumes that, in addition to thinking of the spatial statistics 
in two steps, it is useful and accurate to think of the physics in two steps 
also. In particular, the model assumes that the regime in which the physics 
is not described by perturbation theory is confined to regions within halos, 
and that halos can be adequately approximated by assuming that they are in 
virial equilibrium. 

Clearly, then, the first and the most important step is to find a suitable def- 
inition of the underlying units, i.e. the halos. This section describes what we 
know about the abundance, spatial distribution, and internal density profiles 
of halos. All these quantities depend primarily on halo mass. In the next sec- 
tion, we combine these ingredients together to build the halo model of large 
scale structure. 



3.1 The spherical collapse model 

The assumption that non-linear objects formed from a spherical collapse is a 
simple and useful approximation. The spherical collapse of an initially tophat 
density perturbation was first studied in 1972 by Gunn & Gott [96]; see [82,16] 
for a discussion of spherical collapse from other initial density profiles. 

In the tophat model, one starts with a region of initial, comoving Lagrangian
size $R_0$. Let $\delta_i$ denote the initial density within this region. We will
suppose that the initial fluctuations were Gaussian with an rms value on scale
$R_0$ which was much less than unity. Therefore, $|\delta_i| \ll 1$ almost surely.
This means that the mass $M_0$ within $R_0$ is
$M_0 = (4\pi R_0^3/3)\,\bar\rho\,(1+\delta_i) \approx (4\pi R_0^3/3)\,\bar\rho$,
where $\bar\rho$ denotes the comoving background density.

As the Universe evolves, the size of this region changes. Let $R$ denote the
comoving size of the region at some later time. The density within the region
is $(R_0/R)^3 \equiv (1+\delta)$. In the spherical collapse model there is a
deterministic relation between the initial comoving Lagrangian size $R_0$ and
density of an object, and its Eulerian size $R$ at any subsequent time. For an
Einstein-de Sitter universe, one can obtain a parametric solution to $R(z)$ in
terms of $\theta$:

\begin{equation}
\frac{R(z)}{R_0} = \frac{(1+z)}{(5/3)|\delta_0|}\, \frac{(1 - \cos\theta)}{2}
\qquad {\rm and} \qquad
\frac{1}{1+z} = \left( \frac{3}{4} \right)^{2/3} \frac{(\theta - \sin\theta)^{2/3}}{(5/3)|\delta_0|} , \tag{50}
\end{equation}

where $\delta_0$ denotes the initial density $\delta_i$ extrapolated using linear
theory to the present time (e.g. [216]). If $\delta_i < 0$, then
$(1 - \cos\theta)$ should be replaced with $(\cosh\theta - 1)$ and
$(\theta - \sin\theta)$ with $(\sinh\theta - \theta)$.

In the spherical collapse model, initially overdense regions collapse: with
$\theta = 0$ at start, they 'turnaround' at $\theta = \pi$, and have collapsed
completely when $\theta = 2\pi$. Equation (50) shows that the size of an
overdense region evolves as

\begin{equation}
\frac{R_0}{R(z)} = \frac{6^{2/3}}{2}\, \frac{(\theta - \sin\theta)^{2/3}}{(1 - \cos\theta)} . \tag{51}
\end{equation}

At turnaround, $\theta = \pi$, so $[R_0/R(z_{\rm ta})]^3 = (3\pi/4)^2$; when an
overdense region turns around, the average density within it is about 5.55 times
that of the background universe.

At collapse, the average density within the region is even higher: formally,
$R(z_{\rm col}) = 0$, so the density at collapse is infinite. In practice the region does
not collapse to vanishingly small size: it virializes at some non-zero size. The
average density within the virialized object is usually estimated as follows.
Assume that after turning around the object virializes at half the value of
the turnaround radius in physical, rather than comoving, units. The virialized
object is then eight times denser than it was at turnaround (because
$R_{\rm vir} = R_{\rm ta}/2$). In the time between turnaround and collapse, the background
universe expands by a factor of $(1+z_{\rm ta})/(1+z_{\rm col}) = 2^{2/3}$ (from equation 50),
so the background density at turnaround is $(2^{2/3})^3 = 4$ times the background
density at $z_{\rm vir}$. Therefore, the virialized object is

$$ \Delta_{\rm vir} \equiv (9\pi^2/16) \times 8 \times 4 = 18\pi^2, \tag{52} $$

times the density of the background at virialization.

What was the initial overdensity of such an object? Setting $\theta = 2\pi$ in
equations (50) shows that if the region is to collapse at $z$, the average density
within it must have had a critical value, $\delta_{\rm sc}$, given by

$$ \frac{\delta_{\rm sc}(z)}{1+z} = \frac{3}{5}\left(\frac{3\pi}{2}\right)^{2/3} \approx 1.686. \tag{53} $$

Thus, a collapsed object is one in which the initial overdensity, extrapolated
using linear theory to the time of collapse, was $\delta_{\rm sc}(z)$. At this time, the actual
overdensity is significantly larger than the linear theory prediction. Although
the formal overdensity is infinite, a more practical estimate (equation 52) says
that the object is about 178 times denser than the background.
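The numbers quoted in this subsection follow directly from the parametric solution. The following is a minimal Python sketch, assuming the Einstein-de Sitter relations of equations (50)-(53) and the usual prescription that the object virializes at half its turnaround radius; the function and variable names are illustrative and do not come from the text.

```python
import math

def initial_to_current_size(theta):
    """R_0 / R(z) from equation (51), for development angle theta in (0, 2*pi)."""
    prefactor = 6.0 ** (2.0 / 3.0) / 2.0
    return prefactor * (theta - math.sin(theta)) ** (2.0 / 3.0) / (1.0 - math.cos(theta))

# Density relative to the background at turnaround (theta = pi): (3*pi/4)^2.
density_at_turnaround = initial_to_current_size(math.pi) ** 3

# From equation (50), 1/(1+z) scales as (theta - sin(theta))^(2/3), so the
# background expands by this factor between turnaround and collapse.
expansion_factor = (2.0 * math.pi / math.pi) ** (2.0 / 3.0)

# Halving the physical radius raises the object's density by 2^3 = 8, while the
# background density drops by expansion_factor^3 = 4 over the same interval.
virial_overdensity = density_at_turnaround * 8.0 * expansion_factor ** 3

# Linear-theory overdensity at collapse, equation (53), in units of (1 + z).
critical_overdensity = 0.6 * (1.5 * math.pi) ** (2.0 / 3.0)

print(density_at_turnaround, virial_overdensity, critical_overdensity)
```

The three printed values correspond to the turnaround density, $\Delta_{\rm vir}$ and $\delta_{\rm sc}/(1+z)$ discussed above.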

There is an important feature of the spherical collapse model which is
extremely useful. Since $(1+\delta) = (R_0/R)^3$, the equations above provide a relation
between the actual overdensity $\delta$, and that predicted by linear theory, $\delta_0$, and
this relation is the same for all $R_0$. That is to say, it is the ratio $R/R_0$ which
is determined by $\delta_i$, rather than the value of $R$ itself. Because the mass of the
object is proportional to $R_0^3$, this means that the critical density for collapse
$\delta_{\rm sc}$ is the same for all objects, whatever their mass. In addition, the evolution
of the average density within a region which is collapsing is also independent
of the mass within it (of course, it does depend on the initial overdensity).

To see what this relation is, note that the parametric solution of equation (50)
can be written as a formal series expansion, the first few terms of which are
[13]

$$ \frac{\delta_0}{1+z} = \sum_k a_k\,\delta^k = \delta - \frac{17}{21}\,\delta^2 + \frac{341}{567}\,\delta^3 - \frac{55805}{130977}\,\delta^4 + \ldots \tag{54} $$

To lowest order this is just the linear theory relation: $\delta$ is the initial $\delta_0$ times
the growth factor. A good approximation to the spherical collapse relation
$\delta_0(\delta)$, valid even when $\delta \gg 1$, is [190]

$$ \frac{\delta_0}{1+z} = \frac{3(12\pi)^{2/3}}{20} - \frac{1.35}{(1+\delta)^{2/3}} - \frac{1.12431}{(1+\delta)^{1/2}} + \frac{0.78785}{(1+\delta)^{0.58661}}. \tag{55} $$
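To see how well the series and the fitting formula track the exact relation, one can evaluate all three side by side. The sketch below is illustrative only: it assumes the overdense Einstein-de Sitter parametric solution of equations (50)-(51), with $1+\delta = (9/2)(\theta-\sin\theta)^2/(1-\cos\theta)^3$ and $\delta_0/(1+z) = (3/5)[(3/4)(\theta-\sin\theta)]^{2/3}$, and the function names are hypothetical.

```python
import math

def exact_relation(theta):
    """Return (delta, delta_0/(1+z)) for an overdense region from the
    parametric Einstein-de Sitter solution, equations (50)-(51)."""
    s = theta - math.sin(theta)
    delta = 4.5 * s ** 2 / (1.0 - math.cos(theta)) ** 3 - 1.0
    linear = 0.6 * (0.75 * s) ** (2.0 / 3.0)
    return delta, linear

def series_relation(delta):
    """First four terms of the series in equation (54)."""
    return (delta - 17.0 / 21.0 * delta ** 2 + 341.0 / 567.0 * delta ** 3
            - 55805.0 / 130977.0 * delta ** 4)

def fitted_relation(delta):
    """Fitting formula of equation (55)."""
    x = 1.0 + delta
    return (3.0 * (12.0 * math.pi) ** (2.0 / 3.0) / 20.0
            - 1.35 / x ** (2.0 / 3.0) - 1.12431 / x ** 0.5 + 0.78785 / x ** 0.58661)

for theta in (0.5, 1.5, math.pi, 5.0):
    delta, linear = exact_relation(theta)
    print(f"delta={delta:9.4f}  exact={linear:.4f}  fit={fitted_relation(delta):.4f}")
```

The series is only useful while $\delta$ is small, whereas the fit remains close to the exact relation through turnaround and approaches the collapse threshold as $\delta$ grows.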



While these are all convenient estimates of the parameters of collapsed
objects, it is important to bear in mind that the collapse is seldom spherical,
and that the estimate for the virial density is rather ad hoc. Descriptions of
ellipsoidal collapse have been considered [129,21,194,257], as have alternative
descriptions of the $\delta_0(\delta)$ relation [79]. In most of what follows, we will ignore
these subtleties.

Though we have used an Einstein-de Sitter model to outline several properties
related to spherical collapse, our discussion remains qualitatively similar in
cosmologies for which $\Omega_m < 1$ and/or $\Omega_\Lambda > 0$. The actual values of $\delta_{\rm sc}$ and $\Delta_{\rm vir}$
depend on cosmology: fitting functions for these are available in the literature
[77,198,197,24,112].

3.2 The average number density of halos



Let $n(m,z)$ denote the comoving number density of bound objects, halos, of
mass $m$ at redshift $z$. (Some authors use $dn/dm$ to denote this same quantity,
and we will use the two notations interchangeably.) Since halos formed from
regions in the initial density field which were sufficiently dense that they later
collapsed, to estimate $n(m,z)$, we must first estimate the number density of
regions in the initial fluctuation field which were dense enough to collapse. A
simple model for this was provided by Press & Schechter in Ref. [219]:

$$ \frac{m^2\,n(m,z)}{\bar\rho}\,\frac{dm}{m} = \nu f(\nu)\,\frac{d\nu}{\nu}, \tag{56} $$

where $\bar\rho$ is the comoving density of the background with

$$ \nu f(\nu) = \sqrt{\frac{\nu}{2\pi}}\,\exp(-\nu/2), \quad\text{and}\quad \nu \equiv \frac{\delta_{\rm sc}^2(z)}{\sigma^2(m)}. \tag{57} $$



Here $\delta_{\rm sc}(z)$ is the critical density required for spherical collapse at $z$,
extrapolated to the present time using linear theory. In an Einstein-de Sitter
cosmology, $\delta_{\rm sc}(z=0) = 1.686$ while in other cosmologies, $\delta_{\rm sc}$ depends weakly on
$\Omega_m$ and $\Omega_\Lambda$ [77]. In equation (57), $\sigma^2(m)$ is the variance in the initial density
fluctuation field when smoothed with a tophat filter of scale $R = (3m/4\pi\bar\rho)^{1/3}$,
extrapolated to the present time using linear theory:

$$ \sigma^2(m) \equiv \int \frac{dk}{k}\,\frac{k^3 P^{\rm lin}(k)}{2\pi^2}\,|W(kR)|^2, \tag{58} $$

where $W(x) = (3/x^3)\,[\sin(x) - x\cos(x)]$.
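The variance integral is straightforward to evaluate numerically. The sketch below is a hedged illustration rather than a production routine: it assumes a user-supplied linear power spectrum (here a hypothetical, arbitrarily normalized power law), integrates equation (58) with the trapezoidal rule on a logarithmic grid, and uses the small-argument series of the tophat window to avoid cancellation. The grid limits and step count are assumptions chosen for this example.

```python
import math

def tophat_window(x):
    """Fourier transform of a spherical tophat, W(x) = 3 (sin x - x cos x) / x^3."""
    if abs(x) < 1e-3:
        return 1.0 - x * x / 10.0  # leading terms of the small-x series
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3

def sigma_squared(radius, power_spectrum, k_min=1e-5, k_max=1e3, n_steps=40000):
    """Equation (58) by the trapezoidal rule on a logarithmic grid in k."""
    log_min, log_max = math.log(k_min), math.log(k_max)
    step = (log_max - log_min) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        k = math.exp(log_min + i * step)
        integrand = (k ** 3 * power_spectrum(k) / (2.0 * math.pi ** 2)
                     * tophat_window(k * radius) ** 2)
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * integrand
    return total * step

def scale_free_spectrum(k, slope=-1.5):
    """Hypothetical, arbitrarily normalized power law P(k) = k^slope."""
    return k ** slope

# For a power law, sigma^2 scales as R^-(slope + 3), which gives a simple check.
ratio = sigma_squared(1.0, scale_free_spectrum) / sigma_squared(2.0, scale_free_spectrum)
print(ratio, 2.0 ** 1.5)
```

For a realistic CDM spectrum one would replace `scale_free_spectrum` with the appropriate linear $P(k)$ and convert mass to radius through $R = (3m/4\pi\bar\rho)^{1/3}$.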

A better fit to the number density of halos in simulations of gravitational
clustering in the CDM family of models is given by Sheth & Tormen [254]:

$$ \nu f(\nu) = A(p)\left(1 + (q\nu)^{-p}\right)\left(\frac{q\nu}{2\pi}\right)^{1/2}\exp(-q\nu/2), \tag{59} $$

where $p \approx 0.3$, $A(p) = [1 + 2^{-p}\,\Gamma(1/2-p)/\sqrt{\pi}]^{-1} \approx 0.3222$, and $q \approx 0.75$. If
$p = 0$ and $q = 1$, then this expression is the same as that in equation (57).
At small $\nu \ll 1$, the mass function scales as $\nu f(\nu) \propto \nu^{0.5-p}$. Whereas the small
mass behavior depends on the value of $p$, the exponential cutoff at $\nu \gg 1$ does
not. The value $\nu = 1$ defines a characteristic mass scale, which is usually
denoted $m_*$: $\sigma(m_*) \equiv \delta_{\rm sc}(z)$ and is $\sim 2 \times 10^{13}\,M_\odot$ at $z = 0$; note that halos
more massive than $m_*$ are rare.

Fig. 3. The halo mass function in numerical simulations of the Virgo
collaboration, plotted against $\ln\sigma^{-1}$. The measured mass distribution is shown in
color; the dashed line shows the Press-Schechter mass function; the dotted line is
a fitting formula which is similar to the Sheth-Tormen mass function. The figure
is from [135].
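Because all mass is assumed to be in halos, both mass functions should integrate to unity over $d\nu/\nu$. The following sketch checks this numerically with Simpson's rule in $\ln\nu$; the integration limits and function names are assumptions made for this illustration, not part of the original text.

```python
import math

def nu_f_press_schechter(nu):
    """Equation (57): nu f(nu) for the Press-Schechter mass function."""
    return math.sqrt(nu / (2.0 * math.pi)) * math.exp(-nu / 2.0)

def sheth_tormen_amplitude(p):
    """Normalization A(p) quoted with equation (59)."""
    return 1.0 / (1.0 + 2.0 ** (-p) * math.gamma(0.5 - p) / math.sqrt(math.pi))

def nu_f_sheth_tormen(nu, p=0.3, q=0.75):
    """Equation (59): nu f(nu) for the Sheth-Tormen mass function."""
    qnu = q * nu
    return (sheth_tormen_amplitude(p) * (1.0 + qnu ** (-p))
            * math.sqrt(qnu / (2.0 * math.pi)) * math.exp(-qnu / 2.0))

def mass_fraction(nu_f, log_nu_min=-80.0, log_nu_max=math.log(200.0), n_steps=8000):
    """Integrate nu f(nu) d(ln nu) with Simpson's rule (n_steps must be even)."""
    step = (log_nu_max - log_nu_min) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        weight = 1.0 if i in (0, n_steps) else (4.0 if i % 2 else 2.0)
        total += weight * nu_f(math.exp(log_nu_min + i * step))
    return total * step / 3.0

print(sheth_tormen_amplitude(0.3))
print(mass_fraction(nu_f_press_schechter), mass_fraction(nu_f_sheth_tormen))
```

The very low lower limit in $\ln\nu$ is needed because the Sheth-Tormen form falls off only as $\nu^{0.5-p}$ at small $\nu$, so low-mass halos carry a non-negligible share of the mass.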

Elegant derivations of equation (57) in [71,211,19] show that it can be re- 
lated to a model in which halos form from spherical collapse. When extended 
to the ellipsoidal collapse model described by [21], the same arguments give 
equation (59) [257,256]. Alternative models for the shape of n(m,z) are
available in the literature [1,180,166,106]; we will not consider them further, 
however, as equation (59) has been found to provide a good description of the 
mass function in numerical simulations. 

This is shown in Figure 3, which is taken from numerical simulations run
by the Virgo collaboration [135]. The jagged lines show the mass function at
various output times in the simulation rescaled from mass $m$ to $\sigma(m)$. The
figure shows that, when rescaled in this way,

$$ f(\sigma, z) \equiv \frac{m}{\bar\rho}\,\frac{dn(m,z)}{d\ln\sigma^{-1}}, \tag{60} $$

is a universal curve (results from all output times in the simulations trace out
approximately the same curve). The dashed line shows that this distribution
of halo masses is not so well described by equation (57). The dotted line shows

$$ f(\sigma) = 0.315\,\exp\left(-\left|\ln\sigma^{-1} + 0.61\right|^{3.8}\right); \tag{61} $$

this fitting formula is accurate to 20% in the range $-1.2 < \ln\sigma^{-1} < 1.05$ [135].
It is very well described by equation (59), which is physically motivated, and
so it is equation (59) which we will use in what follows.

3.3 The number density of halos in dense regions

Suppose we divide space up into cells of comoving volume $V$. The different cells
may contain different amounts of mass $M$, which means they have different
densities: $M/V \equiv \bar\rho(1+\delta)$. Let $N(m,z_1|M,V,z_0)$ denote the average number
of $m$ halos which collapsed at $z_1$, and are in cells of size $V$ which contain mass
$M$ at $z_0$. The overdensity of halos in such cells is

$$ \delta_h(m,z_1|M,V,z_0) = \frac{N(m,z_1|M,V,z_0)}{n(m,z_1)\,V} - 1. \tag{62} $$

Since we already have a model for the denominator, to proceed, we need a
good estimate of $N(m,z_1|M,V,z_0)$.

A halo is a region which was sufficiently overdense that it collapsed. So the
number of halos within $V$ equals the initial size of $V$ times the number density
of regions within it which were sufficiently dense that they collapsed to form
halos. If $V$ is overdense today, its comoving size is smaller than it was initially;
the initial comoving size was $M/\bar\rho = V(1+\delta)$. If we write $N(m,z_1|M,V,z_0) \equiv
n(m,z_1|M,V,z_0)\,V(1+\delta)$, then we need an estimate of the number density
$n(m,z_1|M,V,z_0)$.

The average number density of halos $n(m,z_1)$ is a function of the critical
density required for collapse at that time: $\delta_{\rm sc}(z_1)$. In the present context, $n(m,z)$
should be thought of as describing the number density of halos in extremely
large cells which are exactly as dense as the background (i.e., cells which have
$M \to \infty$ and $\delta = 0$). Denser cells may be thought of as regions in which
the critical density for collapse is easier to reach, so a good approximation to
$n(m,z_1|M,V,z_0)$ is obtained by replacing $\delta_{\rm sc}(z)$ in the expression for $n(m,z)$
with $\delta_{\rm sc}(z_1) - \delta_0(\delta,z_0)$ [190]. Note that we cannot use $\delta$ itself, because $\delta_{\rm sc}(z_1)$
has been extrapolated from the initial conditions using linear theory, whereas
$\delta$, being the actual value of the density, has been transformed from its value
in the initial conditions using non-linear theory. Equations (54) and (55) show
the spherical collapse model for this non-linear $\delta_0(\delta,z)$ relation. Here, $\delta_0(\delta,z_0)$
denotes the initial density, extrapolated using linear theory, which a region
must have had so as to have density $\delta$ at $z_0$.



Thus, a reasonable estimate of the density of $m$-halos which virialized at $z_1$
and are in cells of size $V$ with mass $M$ at $z_0$ is

$$ \frac{m^2\,n(m,z_1|M,V,z_0)}{\bar\rho}\,\frac{dm}{m} = \nu_{10} f(\nu_{10})\,\frac{d\nu_{10}}{\nu_{10}} \quad\text{where}\quad \nu_{10} = \frac{[\delta_{\rm sc}(z_1) - \delta_0(\delta,z_0)]^2}{\sigma^2(m) - \sigma^2(M)} \tag{63} $$

and $f(\nu)$ is the same functional form which described the unconditional mass
function (equation 57 or 59).

Two limits of this expression are interesting. As $V \to 0$, $\delta \to \infty$ and $\delta_0 \to
\delta_{\rm sc}(z_0)$ independent of the value of $M$. A region of small size which contains
mass $M$, however, is what we call a halo, with mass $M$. Thus, if we are given
a halo of mass $M$ at $z_0$, then $N(m,z_1|M,V=0,z_0)$ is the average number
of subclumps of mass $m$ it contained at the earlier time when $z_1 > z_0$. This
limit of equation (63) gives what is often called the conditional or progenitor
mass function [23,19,163,255]. The opposite limit is also very interesting. As
$V \to \infty$, $M \to \infty$ as well: in this limit, $\sigma^2(M) \to 0$ and $|\delta| \to 0$, and so
equation (63) reduces to $n(m,z_1)$, as expected.

Suppose that we are in the large cell limit. By large, we mean that the rms
density fluctuation in these cells is much smaller than unity. Thus, $|\delta| \ll 1$
in most cells and we can use equation (54) for $\delta_0(\delta)$. Large cells contain large
masses, so $M$ in these cells is much larger than the mass $m_*$ of a typical
halo. In this limit, $\sigma(M) \ll \sigma(m)$ for most values of $m$, allowing one to set
$\sigma(M) \to 0$. This leads to

$$ n(m,z_1|M,V,z_0) \approx n(m,z_1) - \delta_0(\delta,z_0)\left(\frac{\partial n(m,z_1)}{\partial\delta_{\rm sc}}\right)_{\delta_{\rm sc}(z_1)} + \ldots, \tag{64} $$

such that

$$ \delta_h(m,z_1|M,V,z_0) \approx \delta - (1+\delta)\,\delta_0(\delta,z_0)\left(\frac{\partial\ln n(m,z_1)}{\partial\delta_{\rm sc}}\right)_{\delta_{\rm sc}(z_1)}. \tag{65} $$

Inserting equation (59) for $n(m,z)$ and keeping terms to lowest order in $\delta$ gives
(e.g., [45,190,252,254])

$$ \delta_h(m,z_1|M,V,z_0) \approx \delta\left(1 + \frac{q\nu - 1}{\delta_{\rm sc}(z_1)} + \frac{2p/\delta_{\rm sc}(z_1)}{1 + (q\nu)^p}\right) \equiv b_1(m,z_1)\,\delta, \tag{66} $$

where $\nu \equiv \delta_{\rm sc}^2(z_1)/\sigma^2(m)$. This expression states that the overdensity of halos
in very large cells is linearly proportional to the overdensity of the mass;
the constant of proportionality, $b_1(m,z_1)$, depends on the masses of the halos,
and the redshifts at which they virialized, but is independent of the size of the cells.

If $q = 1$ and $p = 0$, then massive halos (those which have $\nu > 1$ or masses
greater than the characteristic mass scale $m_*$) have $b_1(m,z_1) > 1$ and are
said to be biased relative to the dark matter, while less massive halos ($\nu < 1$)
are anti-biased. Notice that $b_1$ can be very large for the most massive halos,
but it is never smaller than $1 - 1/\delta_{\rm sc}(z_1)$. Equation (53) shows that halos which
virialized at the present time (i.e., $z_1 = 0$) have bias factors which are never
less than $\approx 0.41$. Since $\delta_{\rm sc}(z_1) \gg 1$ when $z_1 \gg 1$ in equation (53), halos that
virialized at early times have bias factors close to unity. (See [204,87,276] for a
derivation of this limiting case which uses the continuity equation.)

Since $M/V \equiv \bar\rho(1+\delta)$, the results above show that, in large cells, $n(m|\delta) \approx
[1 + b_1(m)\,\delta]\,n(m)$. Since $b_1(m) \gg 1$ for the most massive halos, they occupy
the densest cells. It is well known that the densest regions of a Gaussian
random field are more strongly clustered than cells of average density [227,143,6].
Therefore, the most massive halos must also be more strongly clustered than
low mass halos. This is an important point to which we will return shortly.
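The linear bias factor is simple enough to tabulate directly. The sketch below evaluates $b_1$ from equation (66) as a function of $\nu$; the default parameter values are the Sheth-Tormen ones quoted with equation (59), the Einstein-de Sitter value $\delta_{\rm sc} = 1.686$ is assumed, and the function name is illustrative.

```python
def linear_bias(nu, delta_sc=1.686, p=0.3, q=0.75):
    """Linear bias b_1 of equation (66); nu = delta_sc^2 / sigma^2(m).

    The defaults are the Sheth-Tormen values quoted with equation (59);
    p=0, q=1 gives the Press-Schechter case discussed in the text.
    """
    return 1.0 + (q * nu - 1.0) / delta_sc + (2.0 * p / delta_sc) / (1.0 + (q * nu) ** p)

for nu in (0.01, 0.1, 1.0, 4.0, 9.0):
    ps = linear_bias(nu, p=0.0, q=1.0)
    st = linear_bias(nu)
    print(f"nu={nu:5.2f}  b1(PS)={ps:6.3f}  b1(ST)={st:6.3f}")
```

In the Press-Schechter case the bias crosses unity at $\nu = 1$ and tends to $1 - 1/\delta_{\rm sc}$ as $\nu \to 0$, which is the lower bound mentioned above.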

3.4 The distribution of halos on large scales: Deterministic biasing 

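The linear bias formula is only accurate on large scales. If we write

$$ \delta_h(m,z_1|M,V,z_0) = \sum_{k>0} \frac{b_k(m,z_1)}{k!}\,\delta^k, \tag{67} $$

then inserting equation (63) in equation (62), setting $\sigma(M) \to 0$, and
expanding gives [236,189]

$$ \begin{aligned}
b_1(m,z_1) &= 1 + \epsilon_1 + E_1, \\
b_2(m,z_1) &= 2(1+a_2)(\epsilon_1+E_1) + \epsilon_2 + E_2, \\
b_3(m,z_1) &= 6(a_2+a_3)(\epsilon_1+E_1) + 3(1+2a_2)(\epsilon_2+E_2) + \epsilon_3 + E_3, \\
b_4(m,z_1) &= 24(a_3+a_4)(\epsilon_1+E_1) + 12[a_2^2 + 2(a_2+a_3)](\epsilon_2+E_2) \\
           &\quad + 4(1+3a_2)(\epsilon_3+E_3) + \epsilon_4 + E_4
\end{aligned} \tag{68} $$

for the first few coefficients. Here

$$ \begin{aligned}
\epsilon_1 &= \frac{q\nu - 1}{\delta_{\rm sc}(z_1)}, &
\epsilon_2 &= \frac{q\nu}{\delta_{\rm sc}^2(z_1)}\,(q\nu - 3), \\
\epsilon_3 &= \frac{q\nu}{\delta_{\rm sc}^3(z_1)}\,(q^2\nu^2 - 6q\nu + 3), &
\epsilon_4 &= \left(\frac{q\nu}{\delta_{\rm sc}^2(z_1)}\right)^2 (q^2\nu^2 - 10q\nu + 15),
\end{aligned} \tag{69} $$

and

$$ \begin{aligned}
E_1 &= \frac{2p/\delta_{\rm sc}(z_1)}{1 + (q\nu)^p}, \qquad
\frac{E_2}{E_1} = \frac{1+2p}{\delta_{\rm sc}(z_1)} + 2\epsilon_1, \qquad
\frac{E_3}{E_1} = \frac{4(p^2-1) + 6pq\nu}{\delta_{\rm sc}^2(z_1)} + 3\epsilon_1^2, \\
\frac{E_4}{E_1} &= \frac{2q\nu}{\delta_{\rm sc}^2(z_1)}\left(\frac{2q^2\nu^2}{\delta_{\rm sc}(z_1)} - 15\epsilon_1\right)
 + 2\,\frac{(1+p)}{\delta_{\rm sc}^2(z_1)}\left(\frac{4(p^2-1) + 8(p-1)q\nu + 3}{\delta_{\rm sc}(z_1)} + 6q\nu\,\epsilon_1\right).
\end{aligned} \tag{70} $$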



If $p = 0$, all the $E_k$'s are also zero, and these expressions reduce to well known
results from [189]. By construction, note that the bias parameters obey
consistency relations:

$$ \int dm\,\frac{m\,n(m,z)}{\bar\rho}\,b_k(m,z) = \begin{cases} 1 & \text{if } k = 1 \\ 0 & \text{if } k > 1 \end{cases} \tag{71} $$
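The $k = 1$ consistency relation can be checked numerically, since equation (56) lets one trade the mass integral for an integral of $\nu f(\nu)\,b_1(\nu)$ over $\ln\nu$. The sketch below is illustrative: it assumes the Sheth-Tormen form of equation (59), the linear bias of equation (66), and $\delta_{\rm sc} = 1.686$; the integration limits and names are assumptions made for this example.

```python
import math

DELTA_SC = 1.686  # Einstein-de Sitter value at z = 0, as quoted in the text

def nu_f(nu, p, q):
    """nu f(nu) of equation (59); p=0, q=1 is the Press-Schechter case."""
    amplitude = 1.0 / (1.0 + 2.0 ** (-p) * math.gamma(0.5 - p) / math.sqrt(math.pi))
    qnu = q * nu
    return (amplitude * (1.0 + qnu ** (-p))
            * math.sqrt(qnu / (2.0 * math.pi)) * math.exp(-qnu / 2.0))

def linear_bias(nu, p, q):
    """b_1 of equation (66)."""
    return 1.0 + (q * nu - 1.0) / DELTA_SC + (2.0 * p / DELTA_SC) / (1.0 + (q * nu) ** p)

def mass_weighted_bias(p=0.3, q=0.75, log_nu_min=-80.0,
                       log_nu_max=math.log(200.0), n_steps=8000):
    """Simpson's rule for the integral of nu f(nu) b_1(nu) over ln(nu)."""
    step = (log_nu_max - log_nu_min) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        nu = math.exp(log_nu_min + i * step)
        weight = 1.0 if i in (0, n_steps) else (4.0 if i % 2 else 2.0)
        total += weight * nu_f(nu, p, q) * linear_bias(nu, p, q)
    return total * step / 3.0

print(mass_weighted_bias(), mass_weighted_bias(p=0.0, q=1.0))
```

Both values should be very close to one, reflecting the fact that the mass-weighted halo distribution must trace the mass itself on large scales.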



Figure 4 compares these predictions for the halo bias factors with measure- 
ments in simulations (from [254]). Note that more massive halos tend to be 
more biased, and that halos of the same mass were more strongly biased at high 
redshift than they are today. The solid and dotted lines show the predictions 
from equation (68) with equations (57) and (59) for the halo mass function, 
respectively. Figure 3 shows that equation (57) predicts too few massive halos; 
as a result, it predicts a larger bias factor for these massive halos than is seen 
in the simulations. Equation (59) provides an excellent fit to the halo mass 
function; the associated bias factors are also significantly more accurate. 

The expressions above for the bias coefficients are obtained from our expression
for the mean number of halos in cells $V$ which contain mass $M$. If the relation
between $\delta_h$ and $\delta$ is deterministic, that is, if the scatter around the mean
number of halos at fixed $M$ and $V$ is small, then the distribution of halo
overdensities is related to that of the dark matter overdensities by a
non-linear transformation; the coefficients $b_k$ describe this relation. Thus, if the
dark matter distribution at late times is obtained by a transformation of the
initial distribution, then the halos are also related to the initial distribution
through a non-linear transformation.

While a deterministic relation between $\delta_h$ and $\delta$ is a reasonable approximation
on large scales, on smaller scales the scatter is significant [190]. On small scales,
the bias is both non-linear and stochastic. Accurate analytic models for this
stochasticity are presented in [252,35], but we will not need them for what
follows. Also ignored in what follows are: i) the deterministic bias coefficients
$b_k(m,z_1)$ which follow from the assumption that halos are associated with
peaks in the initial density field [189]; ii) the deterministic bias coefficients
which are motivated by perturbation theory rather than the spherical collapse
model [36]. (Also, see [91] for a discussion of the relation between perturbation
theory and the coefficients in the expressions above.)

Fig. 4. Large scale bias relation between halos and mass (from [254]). Symbols
show the bias factors at $z_{\rm obs}$ for objects which were identified as virialized halos
at $z_{\rm form} = 4, 2, 1$ and $0$ (top to bottom in each panel). Dotted and solid lines show
predictions based on the Press-Schechter and Sheth-Tormen mass functions.

On large scales where deterministic biasing is a good approximation, the
variance of halo counts in cells is

$$ \left\langle\delta_h(m,z_1|M,V,z_0)^2\right\rangle = \left\langle\Big(\sum_{k>0}\frac{b_k(m,z_1)}{k!}\,\delta^k\Big)^2\right\rangle \approx b_1^2(m,z_1)\,\langle\delta^2\rangle_V. \tag{72} $$

Thus, to describe the variance of the halo counts we must know the variance
in the dark matter on the same scale: $\langle\delta^2\rangle_V$. On very large scales, it should
be a good approximation to replace $\langle\delta^2\rangle_V$ by the linear theory estimate. On
slightly smaller scales, it is better to use the perturbation theory estimates
of [236]. Given these, the variance of halo counts in cells on large scales is
straightforward to compute.

The higher order moments can also be estimated if the biasing is deterministic.
This is because equation (67) allows one to write the higher order moments
of the halo distribution, $\langle\delta_h^n\rangle$, in terms of those of the dark matter, $\langle\delta^n\rangle$.
Quasi-linear perturbation theory shows that $\langle\delta^n\rangle \equiv S_n\,\langle\delta^2\rangle^{n-1}$ if $\langle\delta^2\rangle \ll 1$
(see equation 40). The $S_n$ are numerical coefficients which are approximately
independent of scale over a range of scales on which $\langle\delta^2\rangle \ll 1$; for clustering
from Gaussian initial conditions, the $S_n$ are given by equation (41). By keeping
terms to consistent order, one can show that [88]

$$ \left\langle\delta_h^n(m,z_1|M,V,z_0)\right\rangle = H_n\,\left\langle\delta_h^2(m,z_1|M,V,z_0)\right\rangle^{n-1}, \tag{73} $$

where

$$ \begin{aligned}
H_3 &= b_1^{-1}\,(S_3 + 3c_2), \qquad H_4 = b_1^{-2}\,(S_4 + 12c_2S_3 + 4c_3 + 12c_2^2), \\
H_5 &= b_1^{-3}\,[S_5 + 20c_2S_4 + 15c_2S_3^2 + (30c_3 + 120c_2^2)S_3 + 5c_4 + 60c_3c_2 + 60c_2^3],
\end{aligned} $$

$c_k = b_k/b_1$ and we have not bothered to write explicitly that $H_n$ depends on
halo mass and on the cell size $V$.

Fig. 5. Higher order moments of the halo distribution if the initial fluctuation
spectrum is scale free and has slope $n = -1.5$. Dotted and solid curves show the result of
assuming the halo mass function has the Press-Schechter and Sheth-Tormen forms.

The expressions above show that the distribution of halos depends explicitly
on the distribution of mass. On large scales where the relation between halos
and mass is deterministic, one might have thought that the linear theory
description of the mass distribution can be used. For clustering from Gaussian
initial conditions, however, linear theory itself predicts that $S_n = 0$ for all
$n > 2$. Therefore, to describe the halo distribution, it is essential to go beyond
linear theory to the quasi-linear perturbation theory.

Figure 5 shows an example of how the first few $H_n$ depend on halo mass,
parameterized by $b(m)$. Note that, in general, the less massive halos (those
for which $b < 1$) have larger values of $H_n$. This is a generic feature of halo
models. At high masses ($b \gg 1$) both sets of curves asymptote to $H_n = n^{n-2}$.

Before moving on to the next subsection, consider some asymptotic properties
of the $b_k$ in equation (68), and of the high order moments $H_n$ derived from
them. For small halos ($\nu \ll 1$) identified at early times ($z_1 \gg 1$), $b_1 \approx 1$ and
$b_k \approx 0$ for $k > 1$. Therefore $H_n = S_n$ and such halos are not biased relative
to mass. In contrast, when $\nu \gg 1$ and $z_1$ is not large, i.e. for massive halos
identified at low redshift, $b_k \approx b_1^k$ for $k > 1$. In this limit, the $H_n$ are
independent of both $S_n$ and $a_k$. Therefore, the spatial distribution of these halos is
determined completely by the statistical properties of the initial density field
and is not modified by the dynamics of gravitational clustering. In the limit
of $\nu \gg 1$, for an initially Gaussian random field, $H_n = n^{n-2}$; these are the
coefficients of a Lognormal distribution which has small variance. This shows
that the most massive halos, or the highest peaks in a Gaussian field, are not
Gaussian distributed.

For small halos identified at low redshift ($\nu \ll 1$ and $z_1 \ll 1$), $b_1 \approx 1 - 1/\delta_{\rm sc}(z_1)$
and $b_k \approx -k!\,(a_{k-1} + a_k)/\delta_{\rm sc}(z_1)$ for $k \ge 2$. In this case $H_n$ may depend
significantly on the dynamical evolution of the underlying mass density field.
The skewness of such halos, $H_3$, can be larger than $S_3$. On the other hand, for
halos with $\nu = 1$, the skewness is $H_3 = S_3 - 6/\delta_{\rm sc}^2(z_1)$, which is substantially
smaller than $S_3$ unless $z_1$ is high.
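These limits are easy to reproduce for the skewness. The sketch below assumes the Press-Schechter case ($p = 0$, $q = 1$), for which $\epsilon_1 = (\nu - 1)/\delta_{\rm sc}$ and $\epsilon_2 = \nu(\nu - 3)/\delta_{\rm sc}^2$, together with $a_2 = -17/21$ from equation (54) and $\delta_{\rm sc} = 1.686$. The value of $S_3$ used here is an arbitrary illustrative input, and the function names are hypothetical.

```python
A2 = -17.0 / 21.0  # second coefficient of the series in equation (54)

def press_schechter_bias(nu, delta_sc=1.686):
    """b_1 and b_2 from equations (68)-(69) with p = 0 and q = 1."""
    eps1 = (nu - 1.0) / delta_sc
    eps2 = nu * (nu - 3.0) / delta_sc ** 2
    return 1.0 + eps1, 2.0 * (1.0 + A2) * eps1 + eps2

def halo_skewness(s3, b1, b2):
    """H_3 = (S_3 + 3 c_2) / b_1 with c_2 = b_2 / b_1."""
    return (s3 + 3.0 * b2 / b1) / b1

S3 = 3.36  # illustrative input only; the text takes S_n from equation (41)
for nu in (1.0, 4.0, 25.0, 400.0):
    b1, b2 = press_schechter_bias(nu)
    print(f"nu={nu:6.1f}  b1={b1:8.3f}  H3={halo_skewness(S3, b1, b2):.3f}")
```

At $\nu = 1$ the result is $S_3 - 6/\delta_{\rm sc}^2$, and for very large $\nu$ it drifts toward the lognormal value $H_3 = 3$ regardless of the input $S_3$.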

The most important result of this subsection is that, in the limit in which 
biasing is deterministic, the bias parameters which relate the halo distribution 
to that of the mass are completely specified if the halo abundance, i.e., the 
halo mass function, is known. If perturbation theory is used to describe the 
distribution of the mass, then these bias parameters allow one to describe the 
distribution of the halos. The perturbation theory predictions and the halo 
mass function both depend on the shape of the initial power spectrum. Thus, 
in this model, the initial fluctuation spectrum is used to provide a complete 
description of halo biasing. 

3.5 Halo density profiles

Secondary infall models of spherical collapse [82,16] suggest that the density
profile around the center of a collapsed halo depends on the initial density
distribution of the region which collapsed. If halos are identified as peaks in
the initial density field [143,116], then massive halos correspond to higher
peaks in the initial fluctuation field. The density run around a high peak is
shallower than the run around a smaller peak [6]: high peaks are less centrally
concentrated. Therefore one might reasonably expect massive virialized halos
to also be less centrally concentrated than low mass halos. Such a trend is
indeed found [198].

Fig. 6. Distribution of dark matter around halo centers (from [198]). The density is
in units of $10^{10}\,M_\odot/{\rm kpc}^3$ and radii are in kpc. Different panels show the density
profile around the least and most massive halos in simulations of a wide variety of
cosmological models and initial power spectra (labelled by the density parameter
and spectral index). Arrows show the softening length; measurements on scales
smaller than this are not reliable. Solid lines show the NFW fit to the density
distribution, which is extremely accurate.

Fig. 7. Mean concentration at fixed mass, $c = r_{\rm vir}/r_s$, for dark matter halos as
a function of halo mass (from results presented in [78]). Different panels show the
same cosmological models and power spectra as Figure 6.

Functions of the form

$$ \rho(r|m) = \frac{\rho_s}{(r/r_s)^\alpha\,(1 + r/r_s)^\beta} \quad\text{or}\quad \rho(r|m) = \frac{\rho_s}{(r/r_s)^\alpha\,[1 + (r/r_s)^\beta]}, \tag{74} $$

have been extensively studied as models of elliptical galaxies [111,300]. Setting
$(\alpha,\beta) = (1,3)$ and $(1,2)$ in the expression on the left gives the Hernquist [111]
and NFW [198] profiles, whereas $(\alpha,\beta) = (3/2,3/2)$ in the expression on the
right is the M99 profile [195].

The NFW and M99 profiles provide very good descriptions of the density run
around virialized halos in numerical simulations (figure 6). The two profiles
differ on small scales, $r \ll r_s$, and whether one provides a better description
of the simulations than the other is still being hotly debated. Both profiles are
parameterized by $r_s$ and $\rho_s$, which define a scale radius and a characteristic
density (proportional to the density at that radius), respectively. Although they
appear to provide a two-parameter fit, in practice, one finds an object of given
mass $m$ and radius $r_{\rm vir}$ in the simulations, and then finds that $r_s$ which provides
the best fit to the density run. This is because the edge of the object is its virial
radius $r_{\rm vir}$, while the combination of $r_s$ and the mass determines the
characteristic density, $\rho_s$, following

$$ m \equiv \int_0^{r_{\rm vir}} dr\,4\pi r^2\,\rho(r|m). \tag{75} $$

Fig. 8. Distribution of concentrations at fixed mass for dark matter halos fit to
NFW profiles. Different symbols show results for halos in different mass bins. When
normalized by the mean concentration in the bin, the distribution is well described
by a log-normal function (equation 77). The pile up of halos at small values of the
concentration is due to numerical resolution of the GIF simulations.



For the NFW and M99 profiles,

$$ m = 4\pi\rho_s r_s^3\left[\ln(1+c) - \frac{c}{1+c}\right] \quad\text{and}\quad m = 4\pi\rho_s r_s^3\,\frac{2\ln(1+c^{3/2})}{3}, \tag{76} $$

where $c \equiv r_{\rm vir}/r_s$ is known as the concentration parameter. Note that we
have explicitly assumed that the halo profile is truncated at $r_{\rm vir}$, even though
formally, the NFW and M99 profiles extend to infinity. Because these profiles
fall as $r^{-3}$ at large radii, the mass within them diverges logarithmically. Our
decision to truncate the profile at the virial radius ensures that the mass within
the profile is the same as that which is described by the halo mass function
discussed previously.
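The mass normalizations for the truncated profiles can be verified by integrating the profile shapes directly. The following sketch works in units of the scale radius ($x = r/r_s$) and compares a simple midpoint-rule integral with the closed forms $\ln(1+c) - c/(1+c)$ for NFW and $2\ln(1+c^{3/2})/3$ for M99; the helper names, including `characteristic_density`, are hypothetical and introduced only for this illustration.

```python
import math

def nfw_mass_factor(c):
    """m / (4 pi rho_s r_s^3) for an NFW profile truncated at c = r_vir / r_s."""
    return math.log(1.0 + c) - c / (1.0 + c)

def m99_mass_factor(c):
    """Same quantity for the M99 profile."""
    return 2.0 * math.log(1.0 + c ** 1.5) / 3.0

def numerical_mass_factor(shape, c, n_steps=20000):
    """Midpoint-rule integral of x^2 shape(x) from 0 to c, with x = r / r_s."""
    step = c / n_steps
    return sum(((i + 0.5) * step) ** 2 * shape((i + 0.5) * step)
               for i in range(n_steps)) * step

def nfw_shape(x):
    return 1.0 / (x * (1.0 + x) ** 2)

def m99_shape(x):
    return 1.0 / (x ** 1.5 * (1.0 + x ** 1.5))

def characteristic_density(mass, r_vir, c):
    """rho_s of an NFW halo from its mass, virial radius and concentration."""
    r_s = r_vir / c
    return mass / (4.0 * math.pi * r_s ** 3 * nfw_mass_factor(c))

print(nfw_mass_factor(10.0), numerical_mass_factor(nfw_shape, 10.0))
print(m99_mass_factor(10.0), numerical_mass_factor(m99_shape, 10.0))
```

Because both closed forms grow like $\ln c$ at large $c$, the enclosed mass never converges, which is why the truncation at $r_{\rm vir}$ is needed to tie the profile to the mass in the mass function.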



Since most of the mass is at radii much smaller than $r_{\rm vir}$, the fitted value of $r_s$
is not very sensitive to the exact choice of the boundary $r_{\rm vir}$. The simulations
show that for halos of the same mass, there is a distribution of concentrations
$c = r_{\rm vir}/r_s$ which is well-fit by a log-normal distribution [138,31]:

$$ p(c|m,z)\,dc = \frac{d\ln c}{\sqrt{2\pi\sigma_{\ln c}^2}}\,\exp\left[-\frac{\ln^2[c/\bar c(m,z)]}{2\sigma_{\ln c}^2}\right]. \tag{77} $$

Although the mean concentration $\bar c(m,z)$ depends on halo mass, the width of
the distribution does not. This is shown in Figure 8, which is taken from [256].
The Figure shows that the distribution of $c/\bar c$ is indeed well approximated by
a log-normal function.



For the NFW profile,

$$ \bar c(m,z) = \frac{9}{1+z}\left[\frac{m}{m_*(z)}\right]^{-0.13} \quad\text{and}\quad \sigma_{\ln c} \approx 0.25, \tag{78} $$

where $m_*(z)$ is the characteristic mass scale at which $\nu(m,z) = 1$. A useful
approximation, due to [212], is that $c[{\rm M99}] \approx (c[{\rm NFW}]/1.7)^{0.9}$. Equation (78)
quantifies the tendency for low mass halos to be more centrally concentrated,
on average, than massive halos.



In what follows, it will be useful to have expressions for the normalized Fourier
transform of the dark matter distribution within a halo of mass $m$:

$$ u(\mathbf{k}|m) = \frac{\int d^3x\,\rho(\mathbf{x}|m)\,e^{-i\mathbf{k}\cdot\mathbf{x}}}{\int d^3x\,\rho(\mathbf{x}|m)}. \tag{79} $$

For spherically symmetric profiles truncated at the virial radius, this becomes

$$ u(k|m) = \int_0^{r_{\rm vir}} dr\,4\pi r^2\,\frac{\sin kr}{kr}\,\frac{\rho(r|m)}{m}. \tag{80} $$

Table 1 contains some $\rho(r|m)$ and $u(k|m)$ pairs which will be useful in what
follows.



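For the NFW profile,

$$ u(k|m) = \frac{4\pi\rho_s r_s^3}{m}\left\{\sin(kr_s)\,\big[{\rm Si}([1+c]kr_s) - {\rm Si}(kr_s)\big] - \frac{\sin(ckr_s)}{(1+c)\,kr_s} + \cos(kr_s)\,\big[{\rm Ci}([1+c]kr_s) - {\rm Ci}(kr_s)\big]\right\}, \tag{81} $$

where the sine and cosine integrals are

$$ {\rm Ci}(x) = -\int_x^\infty \frac{\cos t}{t}\,dt \quad\text{and}\quad {\rm Si}(x) = \int_0^x \frac{\sin t}{t}\,dt. \tag{82} $$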



Figure 9 shows $u(k|m)$ as a function of $m$ for NFW halos. In general, the shape
of the Fourier transform depends both on the halo concentration parameter,
$c$, and the mass $m$. The figure shows a trend which is common to all the profiles in
Table 1: the small scale power is dominated by low mass halos.
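The normalized NFW transform can be cross-checked numerically. The sketch below compares a direct evaluation of equation (80) with the standard closed form for a truncated NFW profile written in terms of differences of sine and cosine integrals; that closed form is stated here as an assumption of the example. Only differences of Si and Ci are needed, so they are computed as plain integrals with Simpson's rule rather than through a special-function library. All names are illustrative, and $\kappa = k r_s$.

```python
import math

def simpson(func, a, b, n_steps=2000):
    """Composite Simpson rule; n_steps must be even."""
    step = (b - a) / n_steps
    total = func(a) + func(b)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * func(a + i * step)
    return total * step / 3.0

def nfw_u_closed_form(kappa, c):
    """Normalized transform of a truncated NFW profile; kappa = k r_s."""
    upper = (1.0 + c) * kappa
    delta_si = simpson(lambda t: math.sin(t) / t, kappa, upper)  # Si(upper) - Si(kappa)
    delta_ci = simpson(lambda t: math.cos(t) / t, kappa, upper)  # Ci(upper) - Ci(kappa)
    mass_factor = math.log(1.0 + c) - c / (1.0 + c)              # m / (4 pi rho_s r_s^3)
    bracket = (math.sin(kappa) * delta_si - math.sin(c * kappa) / upper
               + math.cos(kappa) * delta_ci)
    return bracket / mass_factor

def nfw_u_direct(kappa, c):
    """Equation (80) integrated directly, with x = r / r_s."""
    mass_factor = math.log(1.0 + c) - c / (1.0 + c)
    # x^2 * sin(kappa x)/(kappa x) * 1/(x (1+x)^2) simplifies to the integrand below.
    integrand = lambda x: math.sin(kappa * x) / (kappa * (1.0 + x) ** 2)
    return simpson(integrand, 0.0, c) / mass_factor

for kappa in (0.01, 0.1, 1.0, 5.0):
    print(kappa, nfw_u_closed_form(kappa, 10.0), nfw_u_direct(kappa, 10.0))
```

Both evaluations tend to one as $\kappa \to 0$, as required by the normalization, and fall off once $\kappa$ exceeds order unity, that is, on scales smaller than $r_s$.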

There are no complete explanations for why the NFW or M99 profiles fit
the density distribution of dark matter in numerical simulations,
although there are reasonably successful models of why the concentrations
depend on mass [198,205,288]. In the present context, the reason why they fit
is of secondary importance; what is important is that these fits provide simple
descriptions of the density run around a halo. In particular, what is important
is that the density run around a dark matter halo depends mainly on its mass;
though the density profile also depends on the concentration, the distribution
of concentrations is determined by the mass.



4 Halos and large scale structure 

At this point, we have formulae for the abundance and spatial distribution of 
halos, as well as for the typical density run around a halo. This means that we 
are now in a position to construct the halo model. The treatment below will 
be completely general. To make the model quantitative, one simply inserts
one's favorite formulae for the halo profile, abundance and clustering (such as
those presented in the previous section) into the expressions below.

The formalism written down by Neyman & Scott [199], which we are now in
a position to consider in detail, had three drawbacks. First, it was phrased
entirely in terms of discrete statistics; some work is required to translate it
into the language of continuous density fields. Second, it was phrased entirely
in terms of real coordinate space quantities. As we will see shortly, many of
the formulae in the model involve convolutions which are considerably easier
to perform in Fourier space. And, finally, the particular model they assumed
for the clustering of halos was not very realistic.

Fig. 9. Fourier transforms of normalized NFW profiles $u(k|m)$, for a variety of
choices of halo mass, at the present time (redshift $z = 0$). Equation (78) for the halo
mass-concentration relation has been used. The curves show that the most massive
halos contribute to the total power only at the largest scales, whereas smaller halos
contribute power even at small scales.

Scherrer & Bertschinger [232] appear to have been the first to write the model 
for a continuous density field, using Fourier space quantities, in a formulation 
which allows one to incorporate more general and realistic halo-halo correla- 
tions into the model. It is this formulation which we describe below. 

4.1 The two-point correlation function

In the model, all mass is bound up into halos which have a range of masses
and density profiles. Therefore, the density at position $\mathbf{x}$ is given by summing
up the contribution from each halo:

$$ \begin{aligned}
\rho(\mathbf{x}) &= \sum_i f_i(\mathbf{x}-\mathbf{x}_i) = \sum_i \rho(\mathbf{x}-\mathbf{x}_i|m_i) \equiv \sum_i m_i\,u(\mathbf{x}-\mathbf{x}_i|m_i) \\
&= \sum_i \int dm\,d^3x'\,\delta(m-m_i)\,\delta^3(\mathbf{x}'-\mathbf{x}_i)\,m\,u(\mathbf{x}-\mathbf{x}'|m),
\end{aligned} \tag{83} $$

where $f_i$ denotes the density profile of the $i$th halo which is assumed to be
centered at $\mathbf{x}_i$. The second equality follows from assuming that the density run
around a halo depends only on its mass; this profile shape is parameterized
by $\rho$, which depends on the distance from the halo center and the mass of the
halo. The third equality defines the normalized profile $u$, which is $\rho$ divided
by the total mass contained in the profile: $\int d^3x'\,u(\mathbf{x}-\mathbf{x}'|m) = 1$.

Table 1

Density profiles and associated normalized Fourier transforms. Distances are in units
of the scale radius: $x = r/r_s$, $c = r_{\rm vir}/r_s$ and $\kappa = kr_s$, and, when truncated, the
boundary of the halo is $r_{\rm vir}$. The sine and cosine integrals are defined in the main
text.

    $\rho(x)$                          range        $u(\kappa)$
    $(2\pi)^{-3/2}\exp(-x^2/2)$                     $\exp(-\kappa^2/2)$
    $\exp(-x)/8\pi$                                 $(1+\kappa^2)^{-2}$
    $\exp(-x)/(4\pi x^2)$                           ${\rm atan}(\kappa)/\kappa$
    $x^{-2}(1+x^2)^{-1}/(2\pi^2)$                   $[1-\exp(-\kappa)]/\kappa$
    $3/(4\pi c^3)$                     $x \le c$    $3[\sin(c\kappa) - c\kappa\cos(c\kappa)]/(c\kappa)^3$
    $(4\pi c x^2)^{-1}$                $x \le c$    ${\rm Si}(c\kappa)/c\kappa$
    $x^{-1}(1+x)^{-2}$                 $x \le c$    Equation (81)

The number density of halos of mass $m$ is

$$
\left\langle \sum_i \delta(m-m_i)\,\delta^3(\mathbf{x}'-\mathbf{x}_i) \right\rangle \equiv n(m), \tag{84}
$$

where $\langle\ldots\rangle$ denotes an ensemble average. The mean density is

$$
\bar\rho = \langle \rho(\mathbf{x}) \rangle = \int dm\,n(m)\,m, \tag{85}
$$

where the ensemble average has been replaced by an average over the halo
mass function $n(m)$ and an average over space.

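The two-point correlation function is

$$
\xi(\mathbf{x}-\mathbf{x}') = \xi^{1h}(\mathbf{x}-\mathbf{x}') + \xi^{2h}(\mathbf{x}-\mathbf{x}') \tag{86}
$$

where

$$
\begin{aligned}
\xi^{1h}(\mathbf{x}-\mathbf{x}') &= \int dm\,\frac{m^2\,n(m)}{\bar\rho^2} \int d^3y\;u(\mathbf{y}|m)\,u(\mathbf{y}+\mathbf{x}-\mathbf{x}'|m) \\
\xi^{2h}(\mathbf{x}-\mathbf{x}') &= \int dm_1\,\frac{m_1\,n(m_1)}{\bar\rho} \int dm_2\,\frac{m_2\,n(m_2)}{\bar\rho} \int d^3x_1\,u(\mathbf{x}-\mathbf{x}_1|m_1) \\
&\quad\times \int d^3x_2\,u(\mathbf{x}'-\mathbf{x}_2|m_2)\;\xi_{hh}(\mathbf{x}_1-\mathbf{x}_2|m_1,m_2);
\end{aligned}
$$

the first term describes the case in which the two contributions to the density
are from the same halo, and the second term represents the case in which
the two contributions are from different halos. Both terms require knowledge
of how the halo abundance and density profile depend on mass. The second
term also requires knowledge of $\xi_{hh}(\mathbf{x}-\mathbf{x}'|m_1,m_2)$, the two-point correlation
function of halos of mass $m_1$ and $m_2$.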

The first term is relatively straightforward to compute: it is just the convolution
of two similar profiles of shape $u(r|m)$, weighted by the total number
density of pairs contributed by halos of mass m. This term was studied in the 
1970's, before numerical simulations had provided accurate models of the halo 
abundances and density profiles [215,184]. The more realistic values of these 
inputs were first used to model this term some twenty years later [250] . 

The second term is more complicated. If $u_1$ and $u_2$ were extremely sharply
peaked, then we could replace them with delta functions; the integrals over
$\mathbf{x}_1$ and $\mathbf{x}_2$ would yield $\xi_{hh}(\mathbf{x}-\mathbf{x}'|m_1,m_2)$. Writing
$\mathbf{x}_1-\mathbf{x}_2 = (\mathbf{x}-\mathbf{x}') + (\mathbf{x}'-\mathbf{x}_2) - (\mathbf{x}-\mathbf{x}_1)$ shows that this should also be a
reasonable approximation if $\xi_{hh}(r|m_1,m_2)$ varies slowly on scales which are larger
than the typical extent of a halo. Following the discussion in the previous section
with respect to halo bias, on large scales where biasing is deterministic,

$$
\xi_{hh}(r|m_1,m_2) \approx b(m_1)\,b(m_2)\,\xi(r). \tag{87}
$$

Now $\xi(r)$ can be taken outside of the integrals over $m_1$ and $m_2$, making the
two integrals separable. The consistency relations (equation 71) show that
each integral equals unity. Thus, on scales which are much larger than the
typical halo, $\xi^{2h}(r) \approx \xi(r)$. However, on large scales, $\xi(r) \approx \xi^{\rm lin}(r)$, and so the
two-halo term is really very simple: $\xi^{2h}(r) \approx \xi^{\rm lin}(r)$.

Setting $\xi_{hh}(r|m_1,m_2) \approx b(m_1)\,b(m_2)\,\xi(r)$ will overestimate the correct value
on intermediate scales. Furthermore, on small scales the halo-halo correlation
function must eventually turn over (halos are spatially exclusive, so each halo
is like a small hard sphere); assuming that it scales like $\xi(r)$ is a gross
overestimate. Using $\xi_{hh}(r|m_1,m_2) \approx b(m_1)\,b(m_2)\,\xi^{\rm lin}(r)$, i.e., using the linear, rather
than the non-linear correlation function, even on the smallest scales, is a crude
but convenient way of accounting for this overestimate. Although the results
of [252] allow one to account for this more precisely, it turns out that great
accuracy is not really needed since, on small scales, the correlation function is
determined almost entirely by the one-halo term anyway. Although almost all
work to date uses this approximation, it is important to bear in mind that its
form is motivated primarily by convenience. For example, if volume exclusion
effects are only important on very small scales, then setting $\xi(r) \approx \xi^{\rm 1\mbox{-}loop}(r)$
rather than $\xi^{\rm lin}(r)$, i.e., using the one-loop perturbation theory approximation
rather than the simpler linear theory estimate, may provide a better
approximation.

Because the model correlation function involves convolutions, it is much easier 
to work in Fourier space: the convolutions of the real-space density profiles 
become simple multiplications of the Fourier transforms of the halo profiles. 
Thus, we can write the dark matter power spectrum as 



$$
\begin{aligned}
P(k) &= P^{1h}(k) + P^{2h}(k), \quad {\rm where} \\
P^{1h}(k) &= \int dm\,n(m) \left(\frac{m}{\bar\rho}\right)^2 |u(k|m)|^2 \\
P^{2h}(k) &= \int dm_1\,n(m_1) \left(\frac{m_1}{\bar\rho}\right) u(k|m_1) \int dm_2\,n(m_2) \left(\frac{m_2}{\bar\rho}\right) u(k|m_2)\,P_{hh}(k|m_1,m_2).
\end{aligned} \tag{88}
$$

Here, $u(k|m)$ is the Fourier transform of the dark matter distribution within
a halo of mass $m$ (equation 80) and $P_{hh}(k|m_1,m_2)$ represents the power spectrum
of halos of mass $m_1$ and $m_2$. Following the discussion of the halo-halo
correlation function (equation 87), we approximate this by

$$
P_{hh}(k|m_1,m_2) \approx \prod_{i=1}^{2} b_1(m_i)\,P^{\rm lin}(k), \tag{89}
$$

where $b_1(m)$ is the linear bias factor $b(m)$ of equation (87),
bearing in mind that the one-loop perturbation theory estimate may be more
accurate than $P^{\rm lin}(k)$.
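
As a worked step (a sketch that only combines equations 88 and 89 with the normalization of $u$ and the consistency relation quoted above from equation 71), inserting the bias approximation into the two-halo term makes the two mass integrals identical, and the large-scale limit follows directly:

```latex
\begin{align}
P^{2h}(k) &\approx \int dm_1\, n(m_1)\,\frac{m_1}{\bar\rho}\, b(m_1)\, u(k|m_1)
             \int dm_2\, n(m_2)\,\frac{m_2}{\bar\rho}\, b(m_2)\, u(k|m_2)\; P^{\rm lin}(k) \\
          &= P^{\rm lin}(k)\left[\int dm\, n(m)\,\frac{m}{\bar\rho}\, b(m)\, u(k|m)\right]^2 , \\
P^{2h}(k) &\to P^{\rm lin}(k)\left[\int dm\, n(m)\,\frac{m}{\bar\rho}\, b(m)\right]^2
           = P^{\rm lin}(k) \qquad (k \to 0,\ u(k|m) \to 1).
\end{align}
```

The last equality assumes that the bias-weighted mass fraction integrates to unity, which is the statement of the consistency relation.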

4.2 Higher-order correlations



Expressions for the higher order correlations may be derived similarly. How- 
ever, they involve multiple convolutions of halo profiles. This is why it is much 
easier to work in Fourier space: the convolutions of the real-space density 
profiles become simple multiplications of the Fourier transforms of the halo 
profiles. Similarly, the three-point and four-point correlations include terms 
which describe the three and four point halo power spectra. The bi- and tri- 
spectra of the halos are 



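$$
\begin{aligned}
B_{hhh}(\mathbf{k}_1,\mathbf{k}_2,\mathbf{k}_3; m_1,m_2,m_3) &= \prod_{i=1}^{3} b_1(m_i) \left[ B^{\rm lin}(\mathbf{k}_1,\mathbf{k}_2,\mathbf{k}_3) + \frac{b_2(m_3)}{b_1(m_3)}\,P^{\rm lin}(k_1)\,P^{\rm lin}(k_2) \right] \\
T_{hhhh}(\mathbf{k}_1,\mathbf{k}_2,\mathbf{k}_3,\mathbf{k}_4; m_1,m_2,m_3,m_4) &= \prod_{i=1}^{4} b_1(m_i) \left[ T^{\rm lin}(\mathbf{k}_1,\mathbf{k}_2,\mathbf{k}_3,\mathbf{k}_4) + \frac{b_3(m_4)}{b_1(m_4)}\,P^{\rm lin}(k_1)\,P^{\rm lin}(k_2)\,P^{\rm lin}(k_3) \right].
\end{aligned} \tag{90}
$$

Notice that these require the power, bi- and trispectra of the mass, as well as
mass-dependent $i$th-order bias coefficients $b_i(m)$. Whereas $P$, $B$ and $T$ come
from perturbation theory (§ 2.2), the bias coefficients are from the non-linear
spherical or ellipsoidal collapse models and are given in § 3.3.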

Using this information, we can write the dark matter bispectrum as

$$
B_{123} = B^{1h} + B^{2h} + B^{3h}, \tag{91}
$$

where,

$$
\begin{aligned}
B^{1h} &= \int dm\,n(m) \left(\frac{m}{\bar\rho}\right)^3 \prod_{i=1}^{3} u(k_i|m) \\
B^{2h} &= \int dm_1\,n(m_1) \left(\frac{m_1}{\bar\rho}\right) u(k_1|m_1) \int dm_2\,n(m_2) \left(\frac{m_2}{\bar\rho}\right)^2 u(k_2|m_2)\,u(k_3|m_2) \\
&\quad\times P_{hh}(k_1|m_1,m_2) + {\rm cyc.} \\
B^{3h} &= \left[ \prod_{i=1}^{3} \int dm_i\,u(k_i|m_i)\,n(m_i) \left(\frac{m_i}{\bar\rho}\right) \right] B_{hhh}(m_1,m_2,m_3),
\end{aligned} \tag{92}
$$

where $B_{hhh}(m_1,m_2,m_3) \equiv B_{hhh}(\mathbf{k}_1,\mathbf{k}_2,\mathbf{k}_3|m_1,m_2,m_3)$ denotes the
bispectrum of halos of mass $m_1$, $m_2$ and $m_3$.

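Finally, the connected part of the trispectrum can be written as the sum of
four terms

$$
T = T^{1h} + T^{2h} + T^{3h} + T^{4h} \tag{93}
$$

where

$$
\begin{aligned}
T^{1h} &= \int dm\,n(m) \left(\frac{m}{\bar\rho}\right)^4 \prod_{i=1}^{4} u(k_i|m) \\
T^{2h} &= \left[ \int dm_1\,n(m_1) \left(\frac{m_1}{\bar\rho}\right) u(k_1|m_1) \int dm_2\,n(m_2) \left(\frac{m_2}{\bar\rho}\right)^3 u(k_2|m_2)\,u(k_3|m_2)\,u(k_4|m_2)\,P_{hh}(k_1|m_1,m_2) + {\rm cyc.} \right] \\
&\quad + \left[ \int dm_1\,n(m_1) \left(\frac{m_1}{\bar\rho}\right)^2 u(k_1|m_1)\,u(k_2|m_1) \int dm_2\,n(m_2) \left(\frac{m_2}{\bar\rho}\right)^2 u(k_3|m_2)\,u(k_4|m_2)\,P_{hh}(|\mathbf{k}_1+\mathbf{k}_2|\,|m_1,m_2) + {\rm cyc.} \right] \\
T^{3h} &= \int dm_1\,n(m_1) \left(\frac{m_1}{\bar\rho}\right) u(k_1|m_1) \int dm_2\,n(m_2) \left(\frac{m_2}{\bar\rho}\right) u(k_2|m_2) \\
&\quad\times \int dm_3\,n(m_3) \left(\frac{m_3}{\bar\rho}\right)^2 u(k_3|m_3)\,u(k_4|m_3)\,B_{hhh}(\mathbf{k}_1,\mathbf{k}_2,\mathbf{k}_3+\mathbf{k}_4|m_1,m_2,m_3) \\
T^{4h} &= \left[ \prod_{i=1}^{4} \int dm_i\,u(k_i|m_i)\,n(m_i) \left(\frac{m_i}{\bar\rho}\right) \right] T_{hhhh}(m_1,m_2,m_3,m_4).
\end{aligned} \tag{94}
$$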



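For simplicity, we reduce the notation related to integrals over the Fourier
transform of halo profiles and write the power spectrum,

$$
\begin{aligned}
P(k) &= P^{1h}(k) + P^{2h}(k) \\
P^{1h}(k) &= M_{02}(k,k) \\
P^{2h}(k) &= P^{\rm lin}(k)\,[M_{11}(k)]^2,
\end{aligned} \tag{95}
$$

bispectrum,

$$
\begin{aligned}
B_{123} &= B^{1h} + B^{2h} + B^{3h} \\
B^{1h} &= M_{03}(k_1,k_2,k_3) \\
B^{2h} &= M_{11}(k_1)\,M_{12}(k_2,k_3)\,P^{\rm lin}(k_1) + {\rm cyc.} \\
B^{3h} &= \left[ \prod_{i=1}^{3} M_{11}(k_i) \right] B^{\rm lin}_{123} + M_{11}(k_1)\,M_{11}(k_2)\,M_{21}(k_3)\,P^{\rm lin}(k_1)\,P^{\rm lin}(k_2) + {\rm cyc.},
\end{aligned} \tag{96}
$$

and trispectrum,

$$
\begin{aligned}
T &= T^{1h} + T^{2h} + T^{3h} + T^{4h} \\
T^{1h} &= M_{04}(k_1,k_2,k_3,k_4) \\
T^{2h} &= \left[ M_{11}(k_1)\,M_{13}(k_2,k_3,k_4)\,P^{\rm lin}(k_1) + {\rm cyc.} \right] + \left[ M_{12}(k_1,k_2)\,M_{12}(k_3,k_4)\,P^{\rm lin}(|\mathbf{k}_1+\mathbf{k}_2|) + {\rm cyc.} \right] \\
T^{3h} &= M_{11}(k_1)\,M_{11}(k_2)\,M_{12}(k_3,k_4)\,B^{\rm lin}(\mathbf{k}_1,\mathbf{k}_2,\mathbf{k}_3+\mathbf{k}_4) \\
&\quad + \left[ M_{11}(k_1)\,M_{11}(k_2)\,M_{22}(k_3,k_4)\,P^{\rm lin}(k_1)\,P^{\rm lin}(k_2) + {\rm cyc.} \right] \\
T^{4h} &= \left[ \prod_{i=1}^{4} M_{11}(k_i) \right] T^{\rm lin}_{1234} + M_{11}(k_1)\,M_{11}(k_2)\,M_{11}(k_3)\,M_{21}(k_4)\,P^{\rm lin}(k_1)\,P^{\rm lin}(k_2)\,P^{\rm lin}(k_3) + {\rm cyc.}
\end{aligned} \tag{97}
$$

Here, $b_0 \equiv 1$ and

$$
M_{ij}(k_1,\ldots,k_j) \equiv \int dm\,n(m) \left(\frac{m}{\bar\rho}\right)^j b_i(m)\,[u(k_1|m) \ldots u(k_j|m)], \tag{98}
$$

with the three-dimensional Fourier transform of the halo density distribution,
$u(k|m)$, following equation (80).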

The one-point moments, smoothed on scale $R$, can also be obtained by an
integral with the appropriate window function $W(kR)$. In the case of variance,

$$
\begin{aligned}
\sigma^2(R) &= \int \frac{k^2 dk}{2\pi^2}\,P(k)\,|W(kR)|^2 \\
&= \int \frac{k^2 dk}{2\pi^2}\,P^{\rm lin}(k)\,[M_{11}(k)]^2\,|W(kR)|^2 + \int \frac{k^2 dk}{2\pi^2}\,M_{02}(k,k)\,|W(kR)|^2 \\
&\approx \sigma^2_{\rm lin}(R) + \int dm\,n(m) \left(\frac{m}{\bar\rho}\right)^2 \overline{u^2}(R|m)
\end{aligned} \tag{99}
$$

where

$$
\overline{u^n}(R|m) \equiv \int \frac{k^2 dk}{2\pi^2}\,u^n(k|m)\,|W(kR)|^2. \tag{100}
$$

In simplifying, we have written the fully non-linear power spectrum of the
density field in terms of the halo model (equation 95) and taken the large-scale
limit $M_{11} \approx 1$. This is a reasonable approximation because of the consistency
conditions (equation 71). Here, $\sigma^2_{\rm lin}(R)$ follows from equation (58). With
similar approximations, we can derive higher-order connected moments (see [236]
for details).

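With the same general integral defined in [236],

$$
A_{ij}(R) \equiv \int dm\,n(m) \left(\frac{m}{\bar\rho}\right)^{j+2} b_i(m)\,\overline{u^2}(R|m)\,[\overline{u}(R|m)]^j, \tag{101}
$$

we can write the one point moments as

$$
\begin{aligned}
\langle\delta^2\rangle &\equiv \sigma^2 = \sigma^2_{\rm lin} + A_{00} \\
\langle\delta^3\rangle &= S_3^{\rm lin}\,\sigma^4_{\rm lin} + 3\,\sigma^2_{\rm lin}\,A_{10} + A_{01} \\
\langle\delta^4\rangle_c &= S_4^{\rm lin}\,\sigma^6_{\rm lin} + 6\,\frac{S_3^{\rm lin}}{3}\,\sigma^4_{\rm lin}\,A_{10} + 7\,\frac{4}{7}\,\sigma^2_{\rm lin}\,A_{11} + A_{02} \\
\langle\delta^5\rangle_c &= S_5^{\rm lin}\,\sigma^8_{\rm lin} + 10\,\frac{S_4^{\rm lin}}{16}\,\sigma^6_{\rm lin}\,A_{10} + 25\,\frac{3}{5}\,\frac{S_3^{\rm lin}}{3}\,\sigma^4_{\rm lin}\,A_{11} + 15\,\sigma^2_{\rm lin}\,A_{12} + A_{03},
\end{aligned} \tag{102}
$$

where the terms in $\langle\delta^n\rangle_c$ are ordered from $n$-halo to 1-halo contributions. The
coefficient of an $m$-halo contribution to $\langle\delta^n\rangle_c$ is given by $s(n,m)$ (e.g. 6 and 7
in the second and third terms of $\langle\delta^4\rangle_c$ in equation 102), the Stirling number of second
kind, which is the number of ways of putting $n$ distinguishable objects ($\delta$) into
$m$ cells (halos), with no cells empty [232].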
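
The counting behind these coefficients is easy to reproduce. The sketch below is an illustration, not code from the original text: it computes the Stirling numbers of the second kind from their standard recurrence, and separately enumerates set partitions to show how the $s(n,m)$ terms split by occupancy pattern (for example, the 7 two-halo terms of the fourth moment split into 4 of type 3-1 and 3 of type 2-2). The function names are hypothetical.

```python
from collections import Counter
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, m):
    """Number of ways to put n distinguishable objects into m non-empty cells."""
    if n == m:
        return 1
    if m == 0 or m > n:
        return 0
    # Either the n-th object joins one of the m existing cells,
    # or it occupies a new cell on its own.
    return m * stirling2(n - 1, m) + stirling2(n - 1, m - 1)


def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part


def shape_counts(n, m):
    """Count the partitions of n objects into m cells, grouped by cell sizes."""
    counts = Counter()
    for part in set_partitions(list(range(n))):
        if len(part) == m:
            shape = tuple(sorted((len(block) for block in part), reverse=True))
            counts[shape] += 1
    return dict(counts)


if __name__ == "__main__":
    print([stirling2(4, m) for m in (2, 3)])     # coefficients in <delta^4>_c
    print([stirling2(5, m) for m in (4, 3, 2)])  # coefficients in <delta^5>_c
    print(shape_counts(4, 2))
```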

In general, we can write the $n$th moment as

$$
\langle\delta^n\rangle_c = S_n^{\rm lin}\,\sigma_{\rm lin}^{2(n-1)} + \sum_{m=2}^{n-1} s(n,m)\,\alpha_{nm}\,S_m^{\rm lin}\,\sigma_{\rm lin}^{2(m-1)}\,A_{1\,n-m-1} + A_{0\,n-2}, \tag{103}
$$

where the first term in equation (103) represents the $n$-halo term, the second
term is the contribution from $m$-halo terms, and the last term is the 1-halo
term. The coefficients $\alpha_{nm}$ measure how many of the terms contribute as
$A_{1\,n-m-1}$, with the other contributions being subdominant. For example, in
equation (102), the 2-halo term of $\langle\delta^4\rangle_c$ has a total contribution of 7 terms; 4 of them
contain 3 particles in one halo and 1 in the other, and 3 of them contain 2
particles in each. The factor 4/7 is included to take into account that the 3-1
amplitude dominates over the 2-2 amplitude. Note that in these results, we
neglected all contributions from the non-linear biasing parameters in view of
the consistency conditions given in equation (71). The $S_n^{\rm lin}$ were defined in
equation (41).

4.3 An illustrative analytic example



To introduce the general behavior of the halo based predictions, we first
consider a simple illustrative example. We assume that the initial spectrum of the
density fluctuation field is $P_0(k) = A/k^{3/2}$. This is not a bad approximation
to the shape of the power spectrum on cluster-like scales in CDM models. If
we set $\Delta_0^2(k) \equiv k^3 P_0(k)/(2\pi^2)$ then the variance on scale $R$ is

$$
\sigma^2(R) = \int \frac{dk}{k}\,\Delta_0^2(k)\,W^2(kR) \propto R^{-3/2}. \tag{104}
$$

Setting $\sigma(R_*) \equiv \delta_{sc}$ means

$$
\Delta_0^2(k) \propto \delta_{sc}^2\,(kR_*)^{3/2}. \tag{105}
$$

We will approximate the fraction of mass in virialized halos of mass $m$ using
equation (57):

$$
f(m)\,dm \equiv \frac{m\,n(m)}{\bar\rho}\,dm = \frac{d\nu}{\nu}\,\sqrt{\frac{\nu}{2\pi}}\,\exp\left(-\frac{\nu}{2}\right),
$$

where $\nu \equiv \delta_{sc}^2/\sigma^2(m) = (m/m_*)^{1/2}$ and $m_* = 4\pi R_*^3\bar\rho/3$. We will assume that
the density run around the center of a virialized halo scales as

$$
\frac{\rho(r|m)}{\bar\rho} = \frac{2\Delta_{nl}}{3\pi}\,c^3(m)\,\frac{y^{-2}}{1+y^2}, \tag{106}
$$

where $\bar\rho$ is the background density, $y = r/r_s$, $c = r_{\rm vir}/r_s$, $\Delta_{nl} = (R/r_{\rm vir})^3$, and
$m/\bar\rho = 4\pi R^3/3$. Here $R$ is the initial size of the halo, $r_{\rm vir}$ is the virial size, and
$r_s$ is the core radius. Since the profile falls more steeply than $r^{-3}$ at large $r$,
the total mass is finite: $4\pi \int dr\,r^2 \rho(r|m) = m$. We will assume that the core
radius depends on halo mass: $c(m) = c_*\,(m_*/m)^\gamma$.

The normalized Fourier transform of this profile is

$$
u(k|m) = \frac{\int dr\,r^2 \rho(r|m)\,\sin(kr)/(kr)}{\int dr\,r^2 \rho(r|m)} = \frac{1 - e^{-kr_s}}{kr_s}.
$$

Note that at large $k$, $u(k|m)$ decreases as $1/k$.

If we set $\gamma = 1/6$ (so more massive halos are less concentrated), then $kr_s = \kappa\nu$,
with $\kappa \equiv kR_*/(c_*\Delta_{nl}^{1/3})$, and the integrals over the mass function which define
the power spectrum can be done analytically. For example, the contribution to the
power from particles which are in the same halo is

$$
\Delta^2_{1h}(k) = \frac{k^3}{2\pi^2} \int dm\,n(m) \left(\frac{m}{\bar\rho}\right)^2 |u(k|m)|^2
= \frac{2\Delta_{nl}}{3\pi}\,c_*^3\,\kappa \left[ 1 + \frac{1}{\sqrt{1+4\kappa}} - \frac{2}{\sqrt{1+2\kappa}} \right],
$$

and the contribution from pairs in separate halos is

$$
\Delta^2_{2h}(k) = B^2(k)\,\Delta_0^2(k) \propto B^2(k)\,\delta_{sc}^2\,(c_*\Delta_{nl}^{1/3}\kappa)^{3/2},
$$

where

$$
B(k) = \frac{1}{\kappa} \left[ \left(1 - \frac{1}{\delta_{sc}}\right) \left(\sqrt{1+2\kappa} - 1\right) + \frac{1}{\delta_{sc}} \left(1 - \frac{1}{\sqrt{1+2\kappa}}\right) \right].
$$

Here $B(k) = \int dm\,[m\,n(m)/\bar\rho]\,b(m)\,u(k|m)$ and we have used the fact that, if
the mass function is given by equation (57), then $b(m) = 1 + (\nu - 1)/\delta_{sc}$ from
equation (66).

At small $k$, the one-halo term is $2\Delta_{nl}c_*^3/\pi$ times $\kappa^3$, whereas the two-halo term
is $\Delta_0^2(k)$ times $B^2(k) \to 1 - (2/\delta_{sc} + 1)\kappa$; the effect is to multiply the linear
spectrum by a $k$ dependent factor which is less than unity. Thus, at small $k$
most of the power comes from the two-halo term. At large $k$, the two-halo
term is $2(1 - 1/\delta_{sc})^2/\kappa$ times the linear spectrum, so it grows as $k^{1/2}$. On the
other hand, the one-halo term is $2\Delta_{nl}c_*^3/3\pi$ times $\kappa$. Thus, the power on small
scales is dominated by the one-halo term.
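
These closed forms and their limits can be cross-checked by doing the integrals over $\nu$ numerically. The following is a minimal sketch under stated assumptions: it uses the substitution $\nu = t^2$ to remove the integrable singularity at $\nu = 0$, takes $\delta_{sc} = 1.686$ as an illustrative value, and compares only the dimensionless $\kappa$-dependent factors (the bracket in the one-halo term and $B$), not the overall prefactors. All function names are hypothetical.

```python
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def one_halo_bracket_closed(kappa):
    """Bracket multiplying (2 Delta_nl / 3 pi) c_*^3 kappa in the one-halo term."""
    return 1.0 + 1.0 / math.sqrt(1.0 + 4.0 * kappa) - 2.0 / math.sqrt(1.0 + 2.0 * kappa)


def one_halo_bracket_numeric(kappa, t_max=12.0):
    """int dnu nu^(-1/2) exp(-nu/2) (1 - exp(-kappa nu))^2 / sqrt(2 pi), nu = t^2."""
    def f(t):
        nu = t * t
        return 2.0 * math.exp(-0.5 * nu) * (1.0 - math.exp(-kappa * nu)) ** 2
    return simpson(f, 0.0, t_max) / math.sqrt(2.0 * math.pi)


def B_closed(kappa, delta_sc=1.686):
    """Bias-weighted profile integral B in closed form."""
    s = math.sqrt(1.0 + 2.0 * kappa)
    return ((1.0 - 1.0 / delta_sc) * (s - 1.0) + (1.0 - 1.0 / s) / delta_sc) / kappa


def B_numeric(kappa, delta_sc=1.686, t_max=12.0):
    """int f(nu) dnu b(nu) u(kappa nu) with b = 1 + (nu - 1)/delta_sc, nu = t^2."""
    def f(t):
        nu = t * t
        x = kappa * nu
        u = -math.expm1(-x) / x if x > 0.0 else 1.0
        return 2.0 * math.exp(-0.5 * nu) * (1.0 + (nu - 1.0) / delta_sc) * u
    return simpson(f, 0.0, t_max) / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    for kappa in (0.1, 1.0, 10.0):
        print(kappa,
              one_halo_bracket_numeric(kappa), one_halo_bracket_closed(kappa),
              B_numeric(kappa), B_closed(kappa))
```

The small-$\kappa$ behavior of the bracket, $3\kappa^2$, is what turns the one-halo prefactor $2\Delta_{nl}c_*^3\kappa/3\pi$ into $2\Delta_{nl}c_*^3\kappa^3/\pi$.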

If, on the other hand, $\gamma = -1/3$ (so more massive halos are more concentrated,
unlike numerically simulated halos), then $\kappa = kr_s = kR_*/(c_*\Delta_{nl}^{1/3})$ is
independent of $m$, and so $u(k|m)$ is also independent of $m$. Since $\int dm\,(m/\bar\rho)^2 n(m) =
3\,(m_*/\bar\rho)$, the two power spectrum terms are

$$
\Delta^2_{1h}(k) = \frac{2\Delta_{nl}}{\pi}\,c_*^3\,\kappa^3 \left(\frac{1 - e^{-\kappa}}{\kappa}\right)^2
\quad {\rm and} \quad
\Delta^2_{2h}(k) = \left(\frac{1 - e^{-\kappa}}{\kappa}\right)^2 \Delta_0^2(k).
$$

This shows how the variation of the central concentration with halo mass
changes the contribution to the total power from the two terms. In addition,
changing the halo mass function would obviously change the final answer. And,
for $\gamma = -1/3$, changing the initial power spectrum only changes the prefactor
in front of $\Delta^2_{1h}(k)$. The prefactor in front of $\Delta^2_{2h}(k)$ is unaffected, though, of
course, $\Delta_0^2(k)$ has been changed.

4.4 Compensated profiles



This section considers density profiles which are 'compensated'; these are
combinations of over- and under-dense perturbations, normalized so that the mass
in each of the two components is the same. The reason for considering such
profiles is to illustrate a curious feature of the halo model: when only positive
perturbations are present, then, as $k \to 0$, the single halo contribution to the
power tends to a constant:

$$
P^{1h}(k \to 0) \to \int dm\,n(m) \left(\frac{m}{\bar\rho}\right)^2. \tag{108}
$$

In CDM-like spectra, the linear power-spectrum is $\propto k$ at small $k$, so that the
single halo term eventually dominates the power. This problem is also present
in the higher order statistics such as the bi- and trispectra. For the power
spectrum, this constant is like a mean square halo mass, so that this excess
large scale power resembles a shot-noise like contribution. This suggests that
it must be subtracted off by hand. However, subtracting the same constant at
all $k$ is not a completely satisfactory solution, because the power at sufficiently
large $k$ can be very small, in which case subtracting off $P^{1h}(k = 0)$ might lead
to negative power at large $k$. The compensated profile model is designed so
that the one-halo term is well behaved at small $k$. On the other hand, as we
show, compensated models suffer from another problem: they have no power
on large scales!

Consider the correlation function which arises from a random, Poisson,
distribution of density perturbations, in which all perturbations are assumed to be
identical. We will consider what happens when we allow perturbations to have
a range of sizes later, and correlations between perturbations will be included
last. The density at a distance $r$ from a compensated perturbation can be
written as the sum of two terms:

$$
\rho(r) = \rho_+(r) + \rho_-(r). \tag{109}
$$

So that we have a concrete model to work with, we will assume that

$$
\frac{\rho_+(r)}{\bar\rho} = a\,\exp\left(-\frac{r^2}{2\sigma_+^2}\right) \quad {\rm and} \quad \frac{\rho_-(r)}{\bar\rho} = 1 - \exp\left(-\frac{r^2}{2\sigma_-^2}\right), \tag{110}
$$

where $\bar\rho$ denotes the mean density of the background in which these
perturbations are embedded. (The Gaussian is a convenient choice because it is a
monotonic function for which the necessary integrals are simple.) We will
require $a > 1$, so that the positive perturbation is denser than the background.
We will discuss the scales $\sigma_+$ and $\sigma_-$ of the two perturbations shortly. For
now, note that the negative perturbation is bounded between zero and one:
$\rho_-(x)$ is always less than the mean density. The reason for this is that we
are imagining that the perturbation can be thought of as an initially uniform
density region of size $\sigma_-$ from which mass has been scooped out according to
$\rho_-(x)$, and replaced by mass which is distributed as $\rho_+(x)$. The total density
fluctuation is

$$
\delta(x) \equiv \frac{\rho(x)}{\bar\rho} - 1 = a\,\exp\left(-\frac{x^2}{2\sigma_+^2}\right) - \exp\left(-\frac{x^2}{2\sigma_-^2}\right). \tag{111}
$$

The integral of $\delta$ over all space is

$$
g \equiv 4\pi \int_0^\infty dx\,x^2\,\delta(x) = (2\pi)^{3/2} \left(a\,\sigma_+^3 - \sigma_-^3\right), \tag{112}
$$

and depends on the amplitudes and scales of the positive and negative
perturbations.

If we set

$$
\sigma_- = a^{1/3}\,\sigma_+ \tag{113}
$$

then $g = 0$. This corresponds to the statement that the positive perturbation
contributes exactly the same amount of mass which the negative perturbation
removed. The only difference is that the mass has been redistributed into the
form $\rho_+(x)$. It is in this sense that the profiles are compensated.

We can build a toy model of evolution from this by defining $\sigma_+(t)/\sigma_- \equiv R(t)$.
We will imagine that, at some initial time, $R(t) \approx 1$, and that it decreases
thereafter. This is supposed to represent the fact that gravity is an attractive
force, so the mass which was initially contained within $\sigma_-$ is later contained
within the smaller region $\sigma_+$. If mass is conserved as $\sigma_+$ shrinks (equation 113),
then it must be that $a(t) = R(t)^{-3}$. Thus, the amplitude $a$ is related to the
ratio of the initial and final sizes of $\sigma_+$. The fact that $R(t) \approx 1$ initially reflects
the assumption that the initial density field was uniform. Today, the mass
is in dense clumps, the positive perturbations. Each positive perturbation
assembled its mass from a larger region in the initial conditions. Our particular
choice of setting $g$ to zero comes from requiring that all the mass in the positive
perturbation came from the negative one.

Note that we haven't yet specified the exact form for the evolution of the
profile, $R(t)$. Independent of this evolution, we can use the formalism in [183]
to compute the correlation functions as a function of given $t$. The correlation
function is the number density of profiles, $\eta$, times the convolution of such
a profile with itself, $\lambda(r)$. Since all the mass is in the positive perturbations,
and each positive perturbation contains mass $m = (2\pi)^{3/2}\bar\rho\,\sigma_-^3$, we can set
$\eta = \bar\rho/m$. In our compensated halo model, $\lambda$ is the sum of three terms:

$$
\lambda(r) = \lambda_{++}(r) + \lambda_{--}(r) - 2\lambda_{+-}(r), \tag{114}
$$

where $\lambda_{++}$, $\lambda_{--}$, and $\lambda_{+-}$ denote the various types of convolutions. For the
Gaussian profiles we are considering here,

$$
\begin{aligned}
\lambda_{++}(r) &= a^2\,\pi^{3/2}\sigma_+^3\,\exp\left(-\frac{r^2}{4\sigma_+^2}\right), \qquad
\lambda_{--}(r) = \pi^{3/2}\sigma_-^3\,\exp\left(-\frac{r^2}{4\sigma_-^2}\right), \\
\lambda_{+-}(r) &= a\,\pi^{3/2} \left(\frac{2\sigma_+^2\sigma_-^2}{\sigma_+^2+\sigma_-^2}\right)^{3/2} \exp\left(-\frac{r^2/2}{\sigma_+^2+\sigma_-^2}\right).
\end{aligned} \tag{115}
$$

Inserting equation (113) makes the factors in front of the exponentials resemble
each other more closely.

It is a simple matter to verify that these compensated profiles satisfy the
integral constraint:

$$
\eta\,4\pi \int_0^\infty dr\,r^2\,\lambda(r) = 0. \tag{116}
$$

If the correlation function were always positive, this integral constraint would
not be satisfied. Because of the minus sign in equation (114) above, the
correlation function in compensated halo models can be negative on large scales.

The power spectrum is obtained by Fourier-transforming the correlation
function. Since the Fourier transform of a Gaussian is a Gaussian, equation (115)
shows that, in these compensated models, the power spectrum is the sum of
three Gaussians:

$$
P(k) = 4\pi \int_0^\infty dr\,r^2\,\eta\,\lambda(r)\,\frac{\sin kr}{kr}
= \eta \left(\frac{m}{\bar\rho}\right)^2 \left( e^{-k^2\sigma_+^2} + e^{-k^2\sigma_-^2} - 2\,e^{-k^2\sigma_+^2/2}\,e^{-k^2\sigma_-^2/2} \right). \tag{117}
$$

The form of this expression is easily understood, since convolutions in real
space are multiplications in Fourier space such that the power spectrum is
simply a sum of products of Gaussians. Indeed, if we let $U(k)$ and $W(k)$ denote
the Fourier transforms of the positive and negative perturbations, then

$$
P(k) = |U(k) - W(k)|^2. \tag{118}
$$

Fig. 10. Density profiles and correlation functions associated with uncompensated
(dashed) and compensated (solid) Gaussian perturbations. The correlation function
$\lambda$ is negative for $r/\sigma_+$ in the range 3-15 or so, so we have plotted $|\lambda|$ instead.

It is interesting to compare this expression with the case of a positive
perturbation only. In this case, $\lambda = \lambda_{++}$, and $P(k) = |U(k)|^2$. Since $\lambda_{++} > 0$
always, such a model does not satisfy the integral constraint. Analogously,
uncompensated profiles have $P(k) \to$ constant at small $k$. If one thinks of the
compensated profile as providing a correction factor to the power spectrum of
positive perturbations, then the expression above shows that this correction
is $k$ dependent: simply subtracting off a constant term from $|U(k)|^2$ is
incorrect. In the compensated Gaussian model above, $W(k) \to 0$ at large $k$, so
$P(k) \to |U(k)|^2$ on small scales. However, $P(k) \to 0$ at small $k$: there is no
power on large scales.



Fig. 10 shows all this explicitly. The panels show density profiles, correlation
functions and power spectra for uncompensated (dashed curves) and
compensated (solid curves) Gaussian perturbations which have $a = 200$ and $\sigma_+ = 1$.
Notice how, for compensated profiles, the correlation function oscillates about
zero. Notice also how $P(k)$ for the two cases tends to very different limits at
small $k$.
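
The two small-$k$ limits can be illustrated with a few lines of code. This is a minimal sketch, assuming the Gaussian model above with $a = 200$ and $\sigma_+ = 1$, the compensation condition $\sigma_- = a^{1/3}\sigma_+$, and the normalization $P(k) = \eta\,|U(k) - W(k)|^2$ with $\eta = \bar\rho/m$; the function names and the choice of units ($\bar\rho = 1$) are illustrative assumptions.

```python
import math

TWO_PI_32 = (2.0 * math.pi) ** 1.5


def profile_transforms(k, a, sigma_plus):
    """Fourier transforms U (positive part) and W (negative part) of delta(x)."""
    sigma_minus = a ** (1.0 / 3.0) * sigma_plus  # compensation condition
    U = a * TWO_PI_32 * sigma_plus ** 3 * math.exp(-0.5 * (k * sigma_plus) ** 2)
    W = TWO_PI_32 * sigma_minus ** 3 * math.exp(-0.5 * (k * sigma_minus) ** 2)
    return U, W


def power(k, a=200.0, sigma_plus=1.0, compensated=True):
    """P(k) = eta |U - W|^2 (compensated) or eta |U|^2 (positive part only)."""
    U, W = profile_transforms(k, a, sigma_plus)
    # eta = rho_bar / m, with m / rho_bar = (2 pi)^(3/2) a sigma_+^3
    eta = 1.0 / (a * TWO_PI_32 * sigma_plus ** 3)
    amplitude = U - W if compensated else U
    return eta * amplitude * amplitude


if __name__ == "__main__":
    for k in (1e-3, 1e-2, 0.1, 1.0, 3.0):
        print(k, power(k, compensated=False), power(k, compensated=True))
```

In this sketch the uncompensated spectrum tends to the constant $m/\bar\rho$ as $k \to 0$, while the compensated one falls off as $k^4$; at large $k$ the two coincide because $W(k)$ is exponentially small there.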

So far we have assumed that all profiles had the same shape, parameterized
by $\sigma_-$. Because $m \propto \sigma_-^3$, allowing for a range of masses is the same as allowing
for a range of profile shapes. Thus,

$$
P(k) = \int dm\,n(m)\,P(k|m), \tag{119}
$$

where $n(m)$ is the number density of perturbations which have mass $m$, and
$P(k|m)$ is the power spectrum for perturbations which contain this mass.
Since each of the $P(k|m)$s tends to zero at small $k$, this will also happen for
$P(k)$. The shape of $n(m)$ depends on the initial spectrum of fluctuations. If
we insert the shape of $n(m)$ associated with an initial $P(k) \propto 1/k$ spectrum,
and use the Gaussian profiles above, then the integral over $m$ can be done
analytically. Using $\delta_{sc}^2/\sigma^2(m) = (m/m_*)^{2/3} \equiv \nu$, and $\sigma_*$ to denote $\sigma_-$ for an
$m_*$ halo, $m_* = (2\pi)^{3/2}\bar\rho\,\sigma_*^3$, we get

$$
\begin{aligned}
k^3 P(k) &= k^3 \int dm\,n(m) \left(\frac{m}{\bar\rho}\right)^2 \left( e^{-k^2\sigma_+^2} + e^{-k^2\sigma_-^2} - 2\,e^{-k^2\sigma_+^2/2}\,e^{-k^2\sigma_-^2/2} \right) \\
&= \left(\frac{k^3 m_*}{\bar\rho}\right) \int \frac{d\nu}{\nu}\,\nu^{3/2+1/2}\,\frac{\exp(-\nu/2)}{\sqrt{2\pi}} \left( e^{-k^2\sigma_+^2} + e^{-k^2\sigma_-^2} - 2\,e^{-k^2\sigma_+^2/2}\,e^{-k^2\sigma_-^2/2} \right) \\
&= 8\pi\kappa^3 \left( \frac{1}{[1 + 2\kappa^2/a^{2/3}]^2} + \frac{1}{[1 + 2\kappa^2]^2} - \frac{2}{[1 + \kappa^2 + \kappa^2/a^{2/3}]^2} \right)
\end{aligned} \tag{120}
$$

where we have set $\kappa = k\sigma_*$.

This spectrum is different from the one in which all halos had the same mass
(equation 117). The power associated with any given halo mass falls
exponentially at large $k$; the result of adding up the contributions from all halos
means that $P(k)$ only decreases as $k^{-4}$ at large $k$. This is a consequence of the
fact that the less massive halos are smaller and much more numerous than
the massive halos.
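
The closed form in equation (120) and its $k^{-4}$ tail can be checked against a direct numerical integration over $\nu$. The sketch below is illustrative only: it assumes $\sigma_-^2 = \sigma_*^2\nu$ and $\sigma_+^2 = \sigma_-^2/a^{2/3}$ (which follow from $m \propto \sigma_-^3$ and the compensation condition), uses $a = 200$ as in Fig. 10, and truncates the integral at an assumed upper limit `nu_max`.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def k3P_closed(kappa, a=200.0):
    """Right-hand side of equation (120), with kappa = k sigma_*."""
    q = a ** (-2.0 / 3.0)
    k2 = kappa * kappa
    return 8.0 * math.pi * kappa ** 3 * (
        1.0 / (1.0 + 2.0 * k2 * q) ** 2
        + 1.0 / (1.0 + 2.0 * k2) ** 2
        - 2.0 / (1.0 + k2 + k2 * q) ** 2
    )


def k3P_numeric(kappa, a=200.0, nu_max=80.0):
    """Direct integration over nu of the middle line of equation (120)."""
    q = a ** (-2.0 / 3.0)
    k2 = kappa * kappa

    def integrand(nu):
        gaussians = (math.exp(-k2 * nu * q) + math.exp(-k2 * nu)
                     - 2.0 * math.exp(-0.5 * k2 * nu * (1.0 + q)))
        # (d nu / nu) nu^(3/2 + 1/2) exp(-nu/2) / sqrt(2 pi) = nu exp(-nu/2) / sqrt(2 pi) d nu
        return nu * math.exp(-0.5 * nu) / math.sqrt(2.0 * math.pi) * gaussians

    # k^3 m_* / rho_bar = (2 pi)^(3/2) kappa^3
    return (2.0 * math.pi) ** 1.5 * kappa ** 3 * simpson(integrand, 0.0, nu_max)


if __name__ == "__main__":
    for kappa in (0.1, 1.0, 3.0):
        print(kappa, k3P_numeric(kappa), k3P_closed(kappa))
    # k^3 P ~ 1/k at large k, i.e. P ~ k^-4
    print(k3P_closed(2.0e3) / k3P_closed(1.0e3))
```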

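We can also work out these relations for tophat perturbations. Here,

$$
\begin{aligned}
\delta(r) &= A - 1 \quad {\rm if}\ 0 \le r \le R_+ \\
&= -1 \quad {\rm if}\ R_+ < r \le R_-,
\end{aligned} \tag{121}
$$

and it equals zero for all $r > R_-$. If we require the mass in the positive
perturbation cancel the mass in the negative one, then $A = (R_-/R_+)^3$. The
various convolution integrals that should be substituted in equation (114) are

$$
\begin{aligned}
\lambda_{++}(r) &= A^2\,\frac{4\pi R_+^3}{3} \left[ 1 - \frac{3r}{4R_+} + \frac{1}{16} \left(\frac{r}{R_+}\right)^3 \right] \quad {\rm if}\ 0 \le r \le 2R_+ \\
\lambda_{--}(r) &= \frac{4\pi R_-^3}{3} \left[ 1 - \frac{3r}{4R_-} + \frac{1}{16} \left(\frac{r}{R_-}\right)^3 \right] \quad {\rm if}\ 0 \le r \le 2R_- \\
\lambda_{+-}(r) &= A\,\frac{4\pi R_+^3}{3} \quad {\rm if}\ 0 \le r \le (R_- - R_+) \\
&= A\,\frac{\pi}{12r}\,(R_- + R_+ - r)^2 \left( r^2 + 2r(R_- + R_+) - 3(R_- - R_+)^2 \right)
\end{aligned} \tag{122}
$$

when $(R_- - R_+) \le r \le (R_+ + R_-)$. It is now straightforward to verify that
the resulting expression for $\lambda(r)$ satisfies the integral constraint given in
equation (116).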
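
A numerical check of the integral constraint for the tophat case can be sketched as follows. This is an illustration, not code from the original text: it builds $\lambda(r)$ from the overlap volume of two spheres (the geometric content of equation 122), imposes $A = (R_-/R_+)^3$, and integrates $4\pi r^2\lambda(r)$ out to $2R_-$. The specific values $A = 8$, $R_+ = 1$ and the function names are assumptions made for the example.

```python
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def sphere_overlap(r, r_small, r_large):
    """Volume common to two spheres (r_small <= r_large) with centers r apart."""
    if r >= r_small + r_large:
        return 0.0
    if r <= r_large - r_small:
        return 4.0 * math.pi * r_small ** 3 / 3.0
    return (math.pi / (12.0 * r)) * (r_small + r_large - r) ** 2 * (
        r * r + 2.0 * r * (r_small + r_large) - 3.0 * (r_large - r_small) ** 2
    )


def lam(r, A=8.0, R_plus=1.0):
    """lambda(r) = lambda_++ + lambda_-- - 2 lambda_+- for compensated tophats."""
    R_minus = A ** (1.0 / 3.0) * R_plus  # from A = (R_- / R_+)^3
    lam_pp = A * A * sphere_overlap(r, R_plus, R_plus)
    lam_mm = sphere_overlap(r, R_minus, R_minus)
    lam_pm = A * sphere_overlap(r, R_plus, R_minus)
    return lam_pp + lam_mm - 2.0 * lam_pm


def integral_constraint(A=8.0, R_plus=1.0):
    """4 pi int_0^{2 R_-} dr r^2 lambda(r), which should vanish."""
    R_minus = A ** (1.0 / 3.0) * R_plus
    return simpson(lambda r: 4.0 * math.pi * r * r * lam(r, A, R_plus),
                   0.0, 2.0 * R_minus)


if __name__ == "__main__":
    scale = (8.0 * 4.0 * math.pi / 3.0) ** 2  # (A V_+)^2, the size of each piece
    print(integral_constraint() / scale)
    print(lam(2.5))  # negative: the correlation function changes sign
```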

Allowing for a range of profile shapes means that $\xi(r) = \int dm\,p(m)\,\eta\,\lambda(r|m)$,
where $m$ parameterizes the profile shape, and $p(m)$ is the probability that
a perturbation had shape $m$. Note that in most models of current interest,
the profile shape is a function of the mass contained in the halo. Since each
of the $\lambda(r|m)$s satisfies the integral constraint, $\xi(r)$ will also. Similarly, the
contribution to $P(k)$ at small $k$ will be zero.

Thus, in contrast to positive perturbations, compensated profiles satisfy the 
integral constraint on the correlation function, and have vanishing power at 
small k. Both these are physically desirable improvements on the positive 
perturbation alone model. 

The model with only positive perturbations is the only one which has been
studied in the literature to date. One consequence of this is that, in these
models, $P^{1h}(k) \to \int dm\,m^2 n(m)/\bar\rho^2 \neq 0$ as $k \to 0$. Since $P^{2h}(k)$ tends to the
linear perturbation theory value in this limit, the sum of the two terms is
actually inconsistent with linear theory on the largest scales. The discrepancy
is small in models of the dark matter distribution, but, for rare objects, the
shot-noise-like contribution from the $P^{1h}(k)$ term can be large [244]. How
to treat this discrepancy is an open question [236]. One might have thought
that compensating the profiles provides a natural way to correct for this.
Unfortunately, compensated profiles are constructed so that $u(k|m) \to 0$ at
small $k$. Since $P^{2h}(k)$ also depends on $u(k|m)$, in compensated models it too
tends to zero at small $k$. Therefore, whereas uncompensated profiles lead to
a little too much power on large scales, compensating the profiles leads to no
large scale power at all! (The physical reason for this is clear: because they
are compensated, the total mass in the profiles integrates to zero. Therefore,
the profiles represent only local rearrangements of the mass; on scales larger
than the typical perturbation, this rearrangement can be ignored, and hence the
models have no large scale power.) This is a drawback of the model which should
be borne in mind when making predictions about the power on large scales.



5 Dark Matter Power Spectrum, Bispectrum and Trispectrum 

We will now discuss results related to the dark matter distribution. We show
how the power spectrum is constructed under the halo model, discuss some
aspects of higher order clustering, and include a calculation of correlations in
estimates of the power spectrum. We conclude this section with a discussion
of the extent to which the halo model can be used as an astrophysical and
cosmological tool, and suggest some ways in which the model can be extended.

Fig. 11. Power spectrum of the dark matter density field at the present time. Curve
labeled 'PD' shows the fitting formula of [210]. Dot dashed curve labeled 'lin' shows
the linear $P^{\rm lin}(k)$. Dotted and short dashed curves show the two terms which sum
to give the total power (solid line) in the halo model.



5.1 Power Spectrum



Figure 11 shows the power spectrum of the dark matter density field at the
present time ($z = 0$). Dotted and short dashed lines show the contributions
to the power from the single and two halo terms. Their sum (solid) should be 
compared to the power spectrum measured in numerical simulations, repre- 
sented here by the dashed curve labeled 'PD' which shows the fitting function 
of equation (46). (At the largest k shown, this fitting function represents an 
extrapolation well beyond what has actually been measured in simulations to 
date, so it may not be reliable.) In computing the halo model curves we have 
included the effect of the scatter in the halo concentrations (equation 78). 
Although ignoring the scatter is actually a rather good approximation, for 
precise calculations, the scatter is important, especially for statistics which 
are dominated by massive halos. 

In general, the linear portion of the dark matter power spectrum, $k < 0.1h$
Mpc$^{-1}$, results from the correlations between dark matter halos and reflects the
halo-mass dependent bias prescription. The spherical or ellipsoidal collapse
based models described previously describe this regime reasonably well at all
redshifts. At $k \sim 0.1$-$1h$ Mpc$^{-1}$, the one- and two-halo terms are comparable;
on these scales, the power comes primarily from halos more massive than $M_*$.
At higher $k$'s, the power comes mainly from individual halos with masses
below $M_*$.

Fig. 12. The (a) equilateral bispectrum and (b) square trispectrum of the dark
matter in the halo model. Solid lines show the total bispectrum and trispectrum,
and the different line styles show the different contributions to the total.

The small scale behavior of the power spectrum is sensitive to assumptions
we make regarding the halo profile. If we change the shape of the density
profile, e.g., from NFW to M99, then P(k) will change. However, if we also 
modify the mean mass-concentration relation, then the difference between the 
two P(k)s can be reduced substantially. We discuss the effect of allowing a 
distribution p(c\m) of concentrations at fixed mass (i.e., allowing some scatter 
around the mean mass-concentration relation) at the end of this section. 

5.2 Bispectrum and trispectrum 

Figures 12(a) and (b) show the bispectrum and trispectrum of the density
fluctuation field at $z = 0$. Since the bispectrum and trispectrum depend
on the shape of the triangle and quadrilateral, respectively, the figure is for
configurations which are equilateral triangles and squares. Since the power
spectra and equilateral bispectra share similar features, it is more instructive
to study $Q_{\rm eq}(k)$, defined by equation (32). Figure 13(a) compares the
halo model estimate of $Q_{\rm eq}$ with the second order perturbation theory (PT)
and HEPT predictions (equations 34 and 49). In the halo prescription, $Q_{\rm eq}$
at $k \gtrsim 10 k_{\rm nonlin} \sim 10h$ Mpc$^{-1}$ arises mainly from the single halo term.
Figure 13(a) also shows the fitting function for $Q_{\rm eq}(k)$ from [239], which is based
on simulations in the range $0.1h < k < 3h$ Mpc$^{-1}$. This function is designed to
converge to the HEPT value at small scales and the PT value at large scales.
Notice that the HEPT prediction is considerably smaller than the halo model
prediction on small scales.

Fig. 13. (a) $Q_{\rm eq}(k)$ and (b) $Q_{\rm sq}$ at $z = 0$. Different line styles show the different
contributions to the total (bold dashed) in the halo model description. Thin solid
lines show the second order perturbation theory (PT) and HEPT values. In (a), the
thick solid line shows the fitting formula for $Q_{\rm eq}$ from [239]. Notice that on linear
scales, the halo model prediction is about twenty percent larger than the PT value
in (a), and about a factor of two larger than the PT value in (b).

Figure 14 (from [236]) compares the predicted $Q_{\rm eq}$s with measurements in
numerical simulations. To the resolution of the simulations, the data are
consistent with perturbation theory at the largest scales, and with HEPT in the
non-linear regime. The halo model predictions based on the two mass function
choices (Press-Schechter and Sheth-Tormen) generally bracket the numerical
simulation results, assuming the same halo profile and concentration-mass
relation in both cases. The most massive halos are responsible
for a significant fraction of the total non-Gaussianity in the non-linear density
field. This is shown in the bottom panel of Figure 14; when halos more massive
than $10^{14} M_\odot/h$ are absent, $Q_{\rm eq}$ is reduced substantially (compare dashed and
solid curves).

Fig. 14. $Q_{\rm eq}(k)$ as a function of scale, (a) measured in numerical simulations and
(b) compared to halo model predictions. In (a), the triangles are measurements
in a box of size $100h^{-1}$ Mpc while squares denote measurements in box sizes of
$300h^{-1}$ Mpc. The linear perturbation theory (PT) and hyperextended perturbation
theory (HEPT) values are shown as solid lines. In (b), the halo model predictions
associated with Press-Schechter and Sheth-Tormen mass functions generally bracket
the measurements. The dashed lines show the result of only including contributions
from halos less massive than $10^{14} h^{-1} M_\odot$. They lie significantly below the solid
curves, illustrating that massive halos provide the dominant contributions to these
statistics. The figure is taken from [236].

The halo based calculation suggests that $Q_{\rm eq}$ increases, whereas HEPT
suggests that $Q_{\rm eq}$ should remain approximately constant, on the smallest
scales. These small scales are just beyond the reach of numerical simulations
to date. As we discuss later, the scales where the two predictions differ
significantly are not easily probed with observations either, at least at the
present time.

For the trispectrum, and especially for its contribution to the power spectrum
covariance discussed below, we are mainly interested in terms of the form
$T(\mathbf{k}_1,-\mathbf{k}_1,\mathbf{k}_2,-\mathbf{k}_2)$, i.e. parallelograms
which are defined by either the length $k_{12}$ or by the angle between
$\mathbf{k}_1$ and $\mathbf{k}_2$. To illustrate our results, we will take
$k_1 = k_2$ and the angle to be 90° ($\mathbf{k}_2 = \mathbf{k}_\perp$) so that
the parallelogram is a square. It is then convenient to define

$$\Delta^2_{\rm sq}(k) \equiv \frac{k^3}{2\pi^2}\, T^{1/3}(\mathbf{k},-\mathbf{k},\mathbf{k}_\perp,-\mathbf{k}_\perp). \qquad (123)$$

This quantity scales roughly as $\Delta^2(k)$. This spectrum is shown in
Figure 12(b) with the individual contributions from the 1h, 2h, 3h and 4h terms
shown explicitly. At $k \gtrsim 10 k_{\rm nonlin} \sim 10h$ Mpc$^{-1}$,
$Q_{\rm sq}$ is due mainly to the single halo term.

As for $Q_{\rm eq}$, the halo model predicts that $Q_{\rm sq}$ will increase at
high $k$. Numerical simulations do not quite have enough resolution to test
this [236].

Figures 13(a) and (b) show that as one considers higher order statistics, the
halo model predicts a substantial excess in power at linear scales compared to
the perturbation theory value. This is another manifestation of the problem,
noted in § 4.4, that, in positive perturbation models, the single-halo
contribution to the power does not vanish as $k \to 0$. While this discrepancy
appears large in the $Q_{\rm sq}$ statistic, it does not affect the calculations
related to the covariance of large scale structure power spectrum measurements
since, on linear scales, the Gaussian contribution usually dominates the
non-Gaussian contribution. However, we caution that dividing the halo model
calculation on linear scales by the linear power spectrum to obtain, say, halo
bias or galaxy bias, may lead to errors.



5.3 Power Spectrum Covariance



The trispectrum is related to the variance of the estimator of the binned power
spectrum [235,185,76,56]:

$$\hat P_i = \frac{1}{V}\int_{{\rm s}i}\frac{d^3k}{V_{{\rm s}i}}\,\delta(\mathbf{k})\,\delta(-\mathbf{k}), \qquad (124)$$

Table 2



Dark Matter Power Spectrum Correlations

$k$                         0.031    0.058    0.093    0.110    0.138    0.169    0.206    0.254    0.313    0.385
0.031                       1.000    0.041    0.086    0.113    0.149    0.172    0.186    0.186    0.172    0.155
0.058                     (0.023)    1.000    0.118    0.183    0.255    0.302    0.334    0.341    0.328    0.305
0.093                     (0.042)  (0.027)    1.000    0.160    0.295    0.404    0.466    0.485    0.475    0.453
0.110                     (0.154)  (0.086)  (0.028)    1.000    0.277    0.433    0.541    0.576    0.570    0.549
0.138                     (0.176)  (0.149)  (0.085)  (0.205)    1.000    0.434    0.580    0.693    0.698    0.680
0.169                     (0.188)  (0.138)  (0.177)  (0.251)  (0.281)    1.000    0.592    0.737    0.778    0.766
0.206                     (0.224)  (0.177)  (0.193)  (0.314)  (0.396)  (0.484)    1.000    0.748    0.839    0.848
0.254                     (0.264)  (0.206)  (0.261)  (0.355)  (0.488)  (0.606)  (0.654)    1.000    0.858    0.896
0.313                     (0.265)  (0.202)  (0.259)  (0.397)  (0.506)  (0.618)  (0.720)  (0.816)    1.000    0.914
0.385                     (0.270)  (0.205)  (0.262)  (0.374)  (0.508)  (0.633)  (0.733)  (0.835)  (0.902)    1.000
$\sqrt{C_{ii}/C^G_{ii}}$     1.00     1.02     1.04     1.07     1.14     1.23     1.38     1.61     1.90     2.26

NOTES. Covariance matrix of the binned dark matter density field power
spectrum, normalized by its diagonal, with $k$ in units of $h$ Mpc$^{-1}$. Upper
triangle displays the covariance found under the halo model. Lower triangle
(parenthetical numbers) displays the covariance found in numerical simulations
by [185]. Final line shows the fractional increase in the errors (root diagonal
covariance) due to non-Gaussianity as calculated using the halo model.

where the integral is over a shell in $k$-space centered around $k_i$,
$V_{{\rm s}i} \approx 4\pi k_i^2 \delta k$ is the volume of the shell and $V$ is
the volume of the survey. Recalling that $\delta_{\rm D}(0) \to V/(2\pi)^3$ for
a finite volume,

$$C_{ij} \equiv \langle \hat P_i \hat P_j\rangle - \langle \hat P_i\rangle\langle \hat P_j\rangle = \frac{1}{V}\left[\frac{(2\pi)^3}{V_{{\rm s}i}}\, 2P_i^2\,\delta_{ij} + T_{ij}\right], \qquad (125)$$

where

$$T_{ij} \equiv \int_{{\rm s}i}\frac{d^3k_i}{V_{{\rm s}i}}\int_{{\rm s}j}\frac{d^3k_j}{V_{{\rm s}j}}\, T(\mathbf{k}_i,-\mathbf{k}_i,\mathbf{k}_j,-\mathbf{k}_j). \qquad (126)$$

Although both terms scale in the same way with the volume of the survey,
only the (first) Gaussian piece necessarily decreases with the volume of the
shell. For the Gaussian piece, the sampling error reduces to a simple root-N
mode counting of independent modes in a shell. The trispectrum quantifies
the non-independence of the modes both within a shell and between shells.
Therefore, calculating the covariance matrix of the power spectrum estimates
reduces to averaging the elements of the trispectrum across configurations in
the shell. For this reason, we now turn to the halo model description of the
trispectrum.

Fig. 15. The correlations in dark matter power spectrum between bands centered
at $k$ (see Table 2) and those centered at $k = 0.031h$ Mpc$^{-1}$ (lower
triangles) and $k = 0.169h$ Mpc$^{-1}$ (upper squares). The open and filled
symbols in these cases are for $200h^{-1}$ Mpc box simulations with $128^3$ and
$256^3$ particles, respectively. The solid lines with filled circles represent
the halo model predictions for the same bands and are consistent with numerical
simulations at the level of 10% or better. The figure is reproduced from [185].

To test the accuracy of the halo trispectrum, we compare dark matter
correlations predicted by our method to those from numerical simulations by
[185] (see also [235]). Specifically, we calculate the covariance matrix
$C_{ij}$ from equation (126) with the bins centered at $k_i$ and volume
$V_{{\rm s}i} = 4\pi k_i^2 \delta k_i$ corresponding to their scheme. We also
employ the parameters of their ΛCDM cosmology and assume that the parameters
that define the halo concentration properties in our fiducial ΛCDM model hold
for this cosmological model also. The physical differences between the two
cosmological models are minor, though normalization differences can lead to
large changes in the correlation coefficients.

Table 2 compares the halo model predictions for the correlation coefficients

$$\hat C_{ij} = \frac{C_{ij}}{\sqrt{C_{ii}\,C_{jj}}} \qquad (127)$$

with those measured in the simulations. Agreement in the off-diagonal elements
is typically better than ±0.1, even in the region where non-Gaussian effects
dominate, and qualitative features such as the increase in correlations across
the non-linear scale are reproduced. The correlation coefficients for two
bands in the linear ($0.031h$ Mpc$^{-1}$) and non-linear ($0.169h$ Mpc$^{-1}$)
regimes are shown in Figure 15. Triangles and squares show the values measured
in the simulations, and filled circles and solid lines show the halo model
predictions. The halo model is in agreement with numerical measurements over
a wide range of scales, suggesting that it provides a reasonable way of
estimating the covariance matrix associated with the dark matter power
spectrum. In contrast, perturbation theory can only be used to describe the
covariance and correlations in the linear regime. In the non-linear regime,
although HEPT provides a reasonable description when $k_i \sim k_j$, it results
in large discrepancies when $k_i \gg k_j$ [185,235,104].

Fig. 16. Real-space moments with $p = 3$ (skewness), 4 (kurtosis) and 5 as a
function of smoothing scale. Squares and triangles show measurements in high
and low resolution simulations, illustrating how difficult it is to make the
measurement. Solid lines show the predictions based on the NFW profile but with
Press-Schechter (lower) and Sheth-Tormen (upper) mass functions. Dashed line
shows the HEPT prediction. The figure is from [236].
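The covariance of equation (125) and the normalized coefficients compared in Table 2 can be assembled in a few lines once band powers and a shell-averaged trispectrum are available. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the band powers, survey volume and flat toy trispectrum are illustrative numbers, not values from the halo model or from [185].

```python
import math


def covariance(p, v_shell, t, volume):
    """Covariance C_ij of binned power estimates, following equation (125).

    p[i] is the band power P_i, v_shell[i] the k-space shell volume V_si,
    t[i][j] the shell-averaged trispectrum T_ij, volume the survey volume V.
    """
    n = len(p)
    # Non-Gaussian piece: the trispectrum enters every element.
    c = [[t[i][j] / volume for j in range(n)] for i in range(n)]
    for i in range(n):
        # Gaussian piece: diagonal only, and it falls as the shell volume grows.
        c[i][i] += (2.0 * math.pi) ** 3 / v_shell[i] * 2.0 * p[i] ** 2 / volume
    return c


def correlation(c):
    """Correlation coefficients C_ij / sqrt(C_ii C_jj)."""
    n = len(c)
    return [[c[i][j] / math.sqrt(c[i][i] * c[j][j]) for j in range(n)]
            for i in range(n)]


def error_inflation(p, v_shell, t, volume):
    """sqrt(C_ii / C^G_ii): growth of the band errors due to the trispectrum."""
    n = len(p)
    zero = [[0.0] * n for _ in range(n)]
    full = covariance(p, v_shell, t, volume)
    gauss = covariance(p, v_shell, zero, volume)
    return [math.sqrt(full[i][i] / gauss[i][i]) for i in range(n)]


# Toy inputs (assumed for illustration only).
k = [0.05, 0.1, 0.2]                                  # band centres
dk = 0.02                                             # band width
v_shell = [4.0 * math.pi * ki ** 2 * dk for ki in k]  # V_si = 4 pi k_i^2 dk
p = [2.0e4, 1.0e4, 3.0e3]                             # band powers P_i
volume = 1.0e6                                        # survey volume V
t = [[1.0e12] * 3 for _ in range(3)]                  # flat toy trispectrum

corr = correlation(covariance(p, v_shell, t, volume))
inflation = error_inflation(p, v_shell, t, volume)
```

Because the Gaussian term carries the factor $1/V_{{\rm s}i}$ while the toy trispectrum does not, the error inflation grows towards the high-$k$ bands, which is the qualitative trend in the final line of Table 2.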

A further test of the accuracy of the halo approach is to consider higher order
real-space moments such as the skewness and kurtosis. Figure 16 compares
measurements of higher order moments in numerical simulations with halo
model predictions: the halo model is in good agreement with the simulations.

Fig. 17. Ratio of the single halo term contribution to the total power when the
distribution of concentrations at fixed mass is lognormal with width
$\sigma_{\ln c}$, to that when $\sigma_{\ln c} \to 0$, for the power spectrum
(a) and trispectrum (b). The small scale behavior, particularly of the higher
order statistics, is sensitive to the high concentration tails of the
$p(c|m)$ distribution.

5.4 Can we trust the halo model? 



The halo model provides a physically motivated means of estimating the two- 
point and higher order statistics of the dark matter density field. However, 
it has several limitations which should not be forgotten when interpreting 
results. As currently formulated, the approach assumes all halos share a pa- 
rameterized smooth spherically-symmetric profile which depends only on halo 
mass. However, we know that halos of the same mass have a distribution of 
concentration parameters, so that there is some variation in halo profile shape, 
even at fixed mass. In addition, halos in simulations are rarely smooth, and
they are often not spherically symmetric.

Fig. 18. Dark matter power spectrum (top) and reduced bispectrum for equilateral
configurations (bottom) in numerical simulations (solid lines). Dashed lines
show the result of replacing halos with smooth M99 profiles and remaking the
measurements. The replacement agrees with the original measurements up to the
resolution limit of the simulation: $k \sim 10h$ Mpc$^{-1}$. Dotted curves are
the linear and nonlinear expectation, based on fitting functions, in the case
of the power spectrum and the linear perturbation theory result of 4/7 in the
case of $Q_{\rm eq}$. The figure is from [172].

It is straightforward to incorporate the distribution of halo concentrations into
the formalism [56,237]. In essence, a distribution $p(c|m)$ leads to changes in
power at non-linear scales $k \gtrsim 1h$ Mpc$^{-1}$. This is shown in
Figure 17(a): the power on large linear scales is unaffected by a distribution
$p(c|m)$, but the large $k$ power increases as the width of $p(c|m)$ increases.
Increasing $\sigma_{\ln c}$ increases the power at small scales because it
raises the probability of high concentrations from the tail of the
distribution. (Recall that simulations suggest $\sigma_{\ln c} \approx 0.25$.)
Higher order statistics depend even more strongly on $\sigma_{\ln c}$, because
they weight the large $c$ tails heavily. To illustrate, Figure 17(b) shows how
the trispectrum, with $k_1 = k_2 = k_3 = k_4$, changes as $\sigma_{\ln c}$
increases.
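The effect of scatter in concentration can be illustrated at a single halo mass by averaging the squared NFW profile transform over a lognormal $p(c|m)$. This is only a sketch under several assumptions: it uses one halo of fixed virial radius rather than the integral over the mass function, the lognormal is centred on $\ln \bar c$ with $\bar c = 10$ chosen arbitrarily, the profile is truncated at the virial radius, and all function names are hypothetical.

```python
import math


def nfw_mass_factor(c):
    """ln(1+c) - c/(1+c): NFW mass within r_vir in units of 4 pi rho_s r_s^3."""
    return math.log(1.0 + c) - c / (1.0 + c)


def nfw_u(k_rvir, c, n=2000):
    """Normalized Fourier transform u(k|m) of an NFW profile cut at r_vir.

    k_rvir is the dimensionless product k * r_vir (must be > 0). The radial
    integral is done with Simpson's rule in x = r / r_s over 0 <= x <= c.
    """
    kappa = k_rvir / c  # k * r_s
    h = c / n

    def f(x):
        # x/(1+x)^2 * sin(kappa x)/(kappa x), simplified to avoid 0/0 at x = 0
        return math.sin(kappa * x) / (kappa * (1.0 + x) ** 2)

    total = f(0.0) + f(c)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0 / nfw_mass_factor(c)


def mean_u2(k_rvir, c_mean, sigma_lnc, n_grid=41):
    """<u^2> over a lognormal p(c|m) centred on ln(c_mean), cut at 4 sigma."""
    if sigma_lnc == 0.0:
        return nfw_u(k_rvir, c_mean) ** 2
    num = 0.0
    den = 0.0
    for j in range(n_grid):
        t = -4.0 + 8.0 * j / (n_grid - 1)  # offset in units of sigma
        w = math.exp(-0.5 * t * t)         # Gaussian weight in ln c
        c = c_mean * math.exp(sigma_lnc * t)
        num += w * nfw_u(k_rvir, c) ** 2
        den += w
    return num / den


def one_halo_ratio(k_rvir, c_mean=10.0, sigma_lnc=0.25):
    """Single-mass analogue of the ratio plotted in Figure 17(a)."""
    return mean_u2(k_rvir, c_mean, sigma_lnc) / mean_u2(k_rvir, c_mean, 0.0)


ratio_large_scale = one_halo_ratio(0.01)  # k r_vir << 1
ratio_small_scale = one_halo_ratio(50.0)  # k r_vir >> 1
```

On large scales $u \to 1$ for every concentration, so the scatter has no effect. On small scales $u^2$ rises steeply with $c$, so the high-concentration tail dominates the average and the ratio exceeds unity.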

Substructure is expected to contribute about 15% of the total dark matter
mass of a halo (e.g., [280,92]), and it will affect the power spectrum and
higher order correlations on small scales. Measurements of $P(k)$ and $B(k)$
for equilateral triangles in which the actual clumpy nonspherical halo profiles
in numerical simulations were replaced by smooth NFW or M99 halo profiles
suggest that for $k \lesssim 10 k_{\rm nonlin}$ or so, substructure and
asphericities are not important (see Figure 18). A detailed discussion of how
to account for this substructure is in [251].

No models to date account for departures from spherical symmetry, but this is 
mainly because until recently [139], there was no convenient parametrization 
of profile shapes which were not spherically symmetric. There is no conceptual 
reason which prevents one from including ellipsoidal halos in the model. Until 
this is done, note that spherically averaged profiles are adequate for modelling 
the power spectrum and other statistics which average over configurations, 
such as the S n parameters. The bispectrum is the lowest order statistic which 
is sensitive to the detailed shape of the halos. The dependence of bispectrum 
configuration on the spherical assumption was shown in some detail by [236];
they found that the spherical assumption may be the cause of discrepancies
at the 20% to 30% level between the halo model predictions and the
configuration dependence of the bispectrum in the mildly non-linear regime
measured in simulations. Uncertainties in the theoretical mass function also
produce variations at the 20% to 30% level (see [135]).

Improvements to the halo model that one should consider include: 

(1) Introduction of the asphericity of dark matter halos through a randomly
inclined distribution of prolate and oblate ellipsoids. Recent work has shown
that simply modifying the spherically symmetric profile shape to have different 
scale lengths along the three principal axes provides a reasonable parametriza- 
tion of the ellipsoidal profiles of halos in numerical simulations, with the dis- 
tribution of axis ratios depending on halo mass [139]. This makes it relatively 
straightforward to include asphericity in the model. Since the same ellipsoidal 
collapse model [257] which predicts the correct shape for the halo mass func- 
tion (equation 59), can also be used to predict the distribution of halo axis 
ratios, it would be interesting to see if this distribution matches that in sim- 
ulations. At the present time, shape information from X-ray observations of 
galaxy clusters is limited [49], although [?] argue that departures from spher- 
ical symmetry are necessary to correctly interpret their data. 

(2) Incorporation of the effects of halo substructure. See [251] for a first step in 
this direction, which incorporates simple models of what is seen in numerical 
simulations [92,28]. 

(3) Solution of the integral constraint problem at large scales discussed in § 4.4.

Fig. 19. The angular two-point correlation function of galaxies in the SDSS early
release data, for a number of bins in apparent $r^*$ band magnitude. In all
cases, the correlation function is quite well described by a power law:
$w(\theta) \propto \theta^{-0.7}$. The figure is from [46].

6 From Dark Matter to Galaxies



We have known since the late 1960's that the angular correlation function
of optically selected galaxies is a power law:
$w(\theta) \propto \theta^{-(\gamma-1)}$, with $\gamma \approx 1.8$ [279].
Figure 19 shows a recent measurement of $w(\theta)$ from the SDSS collaboration
[46]: it is also well described by this power law. This suggests that the
three-dimensional correlation functions and power spectra should also be
power laws. The symbols in Figure 20 show that the power spectrum of galaxies
in the PSCz survey as measured by [105] is accurately described by a
power law over a range of scales which spans about three orders of magnitude.
More recently, the 2dFGRS [203] and SDSS [298] data show that, although
more luminous galaxies cluster more strongly, for a wide range of
luminosities, the three-dimensional correlation function is indeed close to a
power law. Figure 21, from [298], shows that although the slope of the power
law is approximately independent of luminosity (left), it is a strong function
of galaxy color; on small scales, redder galaxies have steeper correlation
functions.

In contrast, a generic prediction of CDM models is that, at the present time,
the two-point correlation function of the dark matter, and its Fourier
transform, the dark matter power spectrum, are not power laws (see, e.g., the
solid curve in Figure 20). Why is the clustering of galaxies so different from
that of the dark matter?

Fig. 20. The PSCz galaxy power spectrum (symbols, from [105]) compared to the
dark matter power spectrum in a ΛCDM model (solid curve). We have fixed the
amplitude of the dark matter power spectrum so that it matches the data on large
scales. The discrepancy on smaller nonlinear scales suggests that the bias between
the galaxies and the dark matter must be scale dependent.



6.1 The clustering of galaxies

In the approach outlined by White & Rees [292], baryonic gas can only cool
and form stars if it is in potential wells such as those formed by virialized dark
matter halos. As a result, all galaxies are expected to be embedded in dark
halos (see Figure 2). More massive halos may contain many galaxies, in which
case it is natural to associate the positions of galaxies with subclumps within
the massive halo; some, typically low mass, halos may contain no galaxies;
but there are no galaxies which are without halos. Within this framework, the
properties of the galaxy population are determined by how the gas cooling rate,
the star formation rate, and the effects of stellar evolution on the reservoir
of cooled gas, depend on the mass and angular momentum of the parent
halo. There are now a number of different prescriptions for modeling these
'gastrophysical' effects [293,157,42].

Fig. 21. Projected two-point correlation function of galaxies with absolute
magnitude and redshift ranges indicated (left) and for different bins in color
(right). In the panel on the left, squares, circles and triangles show results
for faint, intermediate and luminous galaxies respectively. Although the more
luminous galaxies are more strongly clustered, the same power-law slope
provides a reasonable fit at all luminosities. In contrast, the slope of the
power law is a strong function of color. Both panels are from [298].

Within the context of the halo model, the gastrophysics determines how many 
galaxies form within a halo, and how these galaxies are distributed around 
the halo center. Thus, the halo model provides a simple framework for think- 
ing about and modeling why galaxies cluster differently than dark matter 
[137,244,212,236,253,240,11]. 

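Suppose we assume that the number of dark matter particles in a halo follows
a Poisson distribution, with mean proportional to the halo mass such that
$\langle N_{\rm dm}|m\rangle \propto m$, and
$\langle N_{\rm dm}(N_{\rm dm}-1)|m\rangle \propto m^2$. Note that these
proportionalities are the origin of the weighting by $m$ and $m^2$ in
equation (88) for the two-halo and one-halo contributions to $P_{\rm dm}(k)$,
respectively. To model the power spectrum of galaxies, we therefore simply
modify equation (88) to read

$$P_{\rm gal}(k) = P^{1h}_{\rm gal}(k) + P^{2h}_{\rm gal}(k), \quad {\rm where}$$

$$P^{1h}_{\rm gal}(k) = \int dm\, n(m)\, \frac{\langle N_{\rm gal}(N_{\rm gal}-1)|m\rangle}{\bar n_{\rm gal}^2}\, |u_{\rm gal}(k|m)|^p,$$

$$P^{2h}_{\rm gal}(k) \approx P^{\rm lin}(k)\left[\int dm\, n(m)\, b_1(m)\, \frac{\langle N_{\rm gal}|m\rangle}{\bar n_{\rm gal}}\, u_{\rm gal}(k|m)\right]^2. \qquad (128)$$

Here,

$$\bar n_{\rm gal} = \int dm\, n(m)\, \langle N_{\rm gal}|m\rangle \qquad (129)$$

denotes the mean number density of galaxies. On large scales where the
two-halo term dominates and $u_{\rm gal}(k|m) \to 1$, the galaxy power spectrum
simplifies to

$$P_{\rm gal}(k) \approx b_{\rm gal}^2\, P^{\rm lin}(k), \qquad (130)$$

where

$$b_{\rm gal} = \int dm\, n(m)\, b_1(m)\, \frac{\langle N_{\rm gal}|m\rangle}{\bar n_{\rm gal}} \qquad (131)$$

denotes the mean bias factor of the galaxy population.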
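The mean galaxy density and the large-scale bias are both one-dimensional integrals over halo mass, so they are simple to evaluate once a mass function, a halo bias and a mean occupation are chosen. The sketch below assumes nothing about those ingredients: the toy mass function, toy bias and toy occupation are hypothetical placeholders chosen only to make the example run, and the function names are not from any library.

```python
import math


def integrate_log(f, m_lo, m_hi, n=400):
    """Trapezoid rule for the integral of f(m) dm on a grid uniform in ln m."""
    h = (math.log(m_hi) - math.log(m_lo)) / n
    total = 0.0
    for i in range(n + 1):
        m = m_lo * math.exp(i * h)
        w = 0.5 if i in (0, n) else 1.0
        total += w * f(m) * m  # dm = m dln m
    return total * h


def galaxy_density(n_of_m, ngal_of_m, m_lo, m_hi):
    """Mean galaxy number density: integral of n(m) <N_gal|m> dm."""
    return integrate_log(lambda m: n_of_m(m) * ngal_of_m(m), m_lo, m_hi)


def galaxy_bias(n_of_m, b1_of_m, ngal_of_m, m_lo, m_hi):
    """Large-scale bias: halo bias b_1(m) weighted by n(m) <N_gal|m>."""
    nbar = galaxy_density(n_of_m, ngal_of_m, m_lo, m_hi)
    num = integrate_log(
        lambda m: n_of_m(m) * b1_of_m(m) * ngal_of_m(m), m_lo, m_hi)
    return num / nbar


# Hypothetical placeholders, NOT the mass function or bias of the text.
def toy_mass_function(m):
    return (m / 1e13) ** -1.9 * math.exp(-m / 1e14) / 1e13


def toy_bias(m):
    return 0.5 + (m / 1e13) ** 0.5


def toy_occupation(m):
    return (m / 1e12) ** 0.9


b_gal = galaxy_bias(toy_mass_function, toy_bias, toy_occupation, 1e11, 1e16)
```

Because the weight is $n(m)\langle N_{\rm gal}|m\rangle$ rather than $n(m)\,m$, any occupation that is not proportional to mass changes the large-scale amplitude relative to the dark matter, even before the profile $u_{\rm gal}$ matters.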

In addition to replacing the weighting by mass (i.e., the number of dark
matter particles) with a weighting by number of galaxies, there are two changes
with respect to equation (88). First, $u_{\rm gal}(k|m)$ denotes the Fourier
transform of the density run of galaxies rather than dark matter around the
halo center. Although a natural choice is to approximate this quantity by using
the subclump distribution within a halo, we will show shortly that setting it
to be the same as that of the dark matter (equation 80) is a reasonable
approximation [258]. Second, in the single-halo term, the simplest model is to
set $p = 2$, as for $P_{\rm dm}(k)$. However, in halos which contain only a
single galaxy, it is natural to assume that the galaxy sits at the center of
its halo. To model this, one would set $p = 2$ when
$\langle N_{\rm gal}(N_{\rm gal}-1)\rangle$ is greater than unity and $p = 1$
otherwise.

It is worth considering a little more carefully where these scalings in the
one-halo term come from. Suppose that in a halo which contains $N_{\rm gal}$
galaxies, one galaxy sits at the halo centre. Each of the galaxies contributes
a factor of $u_{\rm gal}$ to the power, except for the central galaxy which
contributes a factor of unity. Pairs which come from the same halo are of two
types: those which include the central galaxy, and those which do not. Since
only the galaxies which are not at the centre get factors of $u_{\rm gal}$, the
weighting must be proportional to

$$\sum_{N_{\rm gal}>1} p(N_{\rm gal}|m)\left[(N_{\rm gal}-1)\,u_{\rm gal}(k|m) + \binom{N_{\rm gal}-1}{2}\,u_{\rm gal}(k|m)^2\right],$$

where $p(N_{\rm gal}|m)$ is the probability that an $m$-halo contains
$N_{\rm gal}$ galaxies, and the sum is over $N_{\rm gal} > 1$ because, to
contribute pairs, there must be at least two galaxies in the halo. The first
term is the contribution from pairs which include the central galaxy, and the
second term is the contribution from the other pairs, of which there are
$\binom{N_{\rm gal}-1}{2} = (N_{\rm gal}-1)(N_{\rm gal}-2)/2$. The sums over
$N_{\rm gal}$ yield

$$\Big[\langle N_{\rm gal}-1|m\rangle + p(0|m)\Big]\Big[u_{\rm gal}(k|m) - u_{\rm gal}(k|m)^2\Big] + \langle N_{\rm gal}(N_{\rm gal}-1)/2\,|m\rangle\, u_{\rm gal}(k|m)^2.$$

Evidently, to compute this term requires knowledge of $p(0|m)$. However, if
we are in the limit where most halos contain no galaxies, then the leading
order contribution to the sum above is $p(2|m)\,u_{\rm gal}(k|m)$. But, in this
limit, $\langle N_{\rm gal}(N_{\rm gal}-1)|m\rangle = \sum N_{\rm gal}(N_{\rm gal}-1)\,p(N_{\rm gal}|m) \approx 2p(2|m)$,
so this leading order term should be well approximated by
$\langle N_{\rm gal}(N_{\rm gal}-1)/2\,|m\rangle\, u_{\rm gal}(k|m)$.
In the opposite limit of a large number of galaxies per halo, it should be
accurate to set $p(0|m) \ll 1$. Then the expression above reduces to
$\langle N_{\rm gal}-1|m\rangle\,[u_{\rm gal}(k|m) - u_{\rm gal}(k|m)^2] + \langle N_{\rm gal}(N_{\rm gal}-1)/2\rangle\, u_{\rm gal}(k|m)^2$.
For Poisson counts, $\langle n(n-1)\rangle = \langle n\rangle^2$. If this is
indicative of other count models also, then this shows that the dominant term
is the one which comes from the second factorial moment. Therefore, it should
be reasonable to approximate the exact expression above by
$\langle N_{\rm gal}(N_{\rm gal}-1)|m\rangle\, u_{\rm gal}(k|m)$ when
$\langle N_{\rm gal}(N_{\rm gal}-1)|m\rangle < 1$, and by
$\langle N_{\rm gal}(N_{\rm gal}-1)|m\rangle\, u_{\rm gal}(k|m)^2$ otherwise.
Notice that the two limits differ only by one factor of $u_{\rm gal}$.
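The pair-counting argument above can be checked numerically. The sketch below assumes Poisson occupation counts purely for illustration (the text goes on to argue that real counts are sub-Poisson at low mass), uses hypothetical function names, and includes a factor of 1/2 in the approximate form so that it counts each pair once, as the explicit sum does.

```python
import math


def poisson_pmf(n, mean):
    """Poisson probability p(n), evaluated in logs to avoid overflow."""
    return math.exp(-mean + n * math.log(mean) - math.lgamma(n + 1))


def exact_weight(mean, u, n_max=200):
    """Direct sum over N >= 2 of p(N) [ (N-1) u + (N-1)(N-2)/2 u^2 ]."""
    total = 0.0
    for n in range(2, n_max + 1):
        central_pairs = (n - 1) * u                          # pairs with the central
        other_pairs = 0.5 * (n - 1) * (n - 2) * u * u        # pairs without it
        total += poisson_pmf(n, mean) * (central_pairs + other_pairs)
    return total


def closed_form(mean, u):
    """[<N-1> + p(0)] (u - u^2) + <N(N-1)/2> u^2, specialised to Poisson."""
    p0 = math.exp(-mean)
    second_factorial = mean * mean  # Poisson: <N(N-1)> = <N>^2
    return (mean - 1.0 + p0) * (u - u * u) + 0.5 * second_factorial * u * u


def approx_weight(mean, u):
    """Two-limit approximation: one power of u if <N(N-1)> < 1, else two."""
    second_factorial = mean * mean
    power = u if second_factorial < 1.0 else u * u
    return 0.5 * second_factorial * power


sparse = (exact_weight(0.05, 0.5), approx_weight(0.05, 0.5))
crowded = (exact_weight(20.0, 0.5), approx_weight(20.0, 0.5))
```

In both the sparse and the crowded regime the approximation tracks the exact sum to within roughly ten percent for these inputs; the intermediate regime, where the mean occupation is of order unity, is where the two forms differ most.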

The expressions above show explicitly that if $\langle N_{\rm gal}|m\rangle$
and $\langle N_{\rm gal}(N_{\rm gal}-1)|m\rangle$ are not proportional to $m$
and $m^2$ respectively, then the clustering of galaxies will be different from
that of the dark matter, even if $u_{\rm gal}(k|m) = u_{\rm dm}(k|m)$.
Because the one- and two-halo terms are modified (with respect to the dark
matter case) by two different functions, it may be possible to adjust them
separately in such a way that they sum to give the observed power law.

Thus, the halo model shows that the distribution $p(N_{\rm gal}|m)$ determines
whether or not $P_{\rm gal}(k)$ is a power law. Although the analysis above
assumed that $p(N_{\rm gal}|m)$ depends only on $m$, it is very likely that
properties of a halo other than its mass help determine the number of galaxies
in it. For example, $N_{\rm gal}$ almost certainly depends on the halo's
formation history. Since the concentration $c$ of the halo density profile also
depends on the formation history [198,78,288], a convenient way to incorporate
the effects of the formation history is to set $p(N_{\rm gal}|m, c)$, and then
integrate over the lognormal scatter in halo concentrations when computing the
halo model predictions. In what follows, we will ignore this subtlety.

Although the exact shape of $p(N_{\rm gal}|m)$ is determined by gastrophysics,
there are some generic properties of it which are worth describing. Since
galaxies form from baryons, a simple first approximation would be to assume
that the first moment, $\langle N_{\rm gal}|m\rangle$, should be proportional to
the mass in baryons, which, in turn, is likely to be a fixed fraction of the
mass in dark matter of the parent halo. If we assume that
$\langle N_{\rm gal}|m\rangle \propto m^\alpha$ (so $\alpha = 1$ is the scaling
of the dark matter), then there are two reasons why we might expect
$\alpha < 1$. Firstly, for the very massive halos, it is natural to associate
galaxies with subclumps within the halo. The total number of subclumps within
a massive parent halo which are more massive than a typical galaxy scales with
$\alpha \sim 0.9$ [92]. Halo substructure as a plausible model for the galaxy
distribution is discussed by [41,158]. If one identifies all subclumps in CDM
haloes which had velocity dispersions larger than about 100 km/s (which is
typical for a small galaxy sized halo), then the correlation function of these
objects is a power law of about the same slope and amplitude as that of
optically selected galaxies. Remarkably, the slope and amplitude of this power
law are approximately the same whether one identifies the subclumps at
redshifts as high as 3 or as low as 0 (see [3] for a clear discussion of why
this happens, and Figure 24 below).

Fig. 22. The total luminosity in galaxies brighter than $M_{r^*} < -18$ which
are in a halo, as a function of the total mass of the halo, from the
semi-analytic galaxy formation models of [157]. Dashed lines show lines of
constant mass-to-light ratio: the value of $M/L$ at $z = 0.5$ shown is a factor
of two smaller than at $z = 0$.

Secondly, galaxy formation depends on the ability of baryons to cool. Since 
the velocity dispersion within a halo increases with halo mass, the efficiency 
of cooling decreases. This might lead to a reduction in the efficiency of galaxy 
formation at the high mass end relative to the low mass end. Such a mass 
dependent efficiency for galaxy formation has been used to explain the ob- 
served excess of entropy in galaxy clusters relative to smaller groups [25]. At 
the low mass end, one might imagine that there is a minimum dark halo mass 
within which galaxies can be found. This is because the energy feedback from 
supernovae which explode following an initial burst of star formation may be 
sufficient to expel the baryons from the shallower potential wells of low mass 
halos. Also, during the epoch of reionization at z < 6, photoionization may 
increase the gas temperature. The temperature of the reheated gas may exceed 
the virial temperature of low mass halos, thus suppressing star formation in 
them [156,30,10]. 

Fig. 23. The average number of galaxies as a function of dark matter halo mass in
the semi-analytic galaxy formation models of [157]. Curves show the fits in
equation (132).

Detailed semi-analytic galaxy formation models allow one to quantify these
effects [157,9]. The symbols in Figure 23 show how $\langle N_{\rm gal}|m\rangle$
depends on galaxy type and luminosity in the models of [157]. The lines show
simple fits (from [253]):

$$\langle N_{\rm Blue}|m\rangle = 0.7 \quad {\rm if}\ 10^{11} M_\odot h^{-1} \le m \le M_{\rm Blue}$$
$$\langle N_{\rm Blue}|m\rangle = 0.7\,(m/M_{\rm Blue})^{\alpha_B} \quad {\rm if}\ m > M_{\rm Blue}$$
$$\langle N_{\rm Red}|m\rangle = (m/M_{\rm Red})^{\alpha_R} \quad {\rm if}\ m \ge 10^{11} M_\odot h^{-1}$$
$$\langle N_{\rm gal}|m\rangle = \langle N_{\rm Blue}|m\rangle + \langle N_{\rm Red}|m\rangle, \qquad (132)$$

where $M_{\rm Blue} = 4 \times 10^{12} M_\odot/h$, $\alpha_B = 0.8$,
$M_{\rm Red} = 2.5 \times 10^{12} M_\odot/h$, and $\alpha_R = 0.9$.
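The fits of equation (132) translate directly into code. In this sketch the function and constant names are hypothetical, masses are in units of $h^{-1} M_\odot$, and returning zero below $10^{11} h^{-1} M_\odot$ is an assumption: the fits are only quoted above that mass.

```python
M_MIN = 1.0e11    # lowest mass for which the fits are quoted
M_BLUE = 4.0e12   # M_Blue
M_RED = 2.5e12    # M_Red
ALPHA_B = 0.8
ALPHA_R = 0.9


def n_blue(m):
    """Mean number of blue galaxies in a halo of mass m."""
    if m < M_MIN:
        return 0.0  # assumption: no galaxies below the fitted range
    if m <= M_BLUE:
        return 0.7  # plateau at low mass
    return 0.7 * (m / M_BLUE) ** ALPHA_B


def n_red(m):
    """Mean number of red galaxies in a halo of mass m."""
    if m < M_MIN:
        return 0.0  # assumption: no galaxies below the fitted range
    return (m / M_RED) ** ALPHA_R


def n_gal(m):
    """Mean total number of galaxies, blue plus red."""
    return n_blue(m) + n_red(m)


def red_fraction(m):
    """Fraction of the mean occupation contributed by red galaxies."""
    return n_red(m) / n_gal(m)
```

Since $\alpha_R > \alpha_B$ and the blue occupation is flat below $M_{\rm Blue}$, the red fraction rises with halo mass, so red galaxies are weighted towards massive halos in both the one-halo and two-halo terms.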

Figure 24 compares the distribution of subclumps in the numerical simulations
of [291] with the expected number counts of galaxies within halos
(equation 132). The number of semianalytic galaxies per halo scales similarly
to the number of dark matter halo subclumps when the mass limit of the
subclumps is above $10^{11} h^{-1} M_\odot$, suggesting that identifying halo
subclumps with galaxies is a reasonable model.

Another interesting feature of these models is shown in Figure 25. The top
and bottom panels show $\langle N_{\rm gal}|m\rangle$ and
$\langle N_{\rm gal}(N_{\rm gal}-1)|m\rangle$ from the GIF models, but now we
only show counts for galaxies which have absolute magnitudes in the range
$-20 < M_{r^*} < -19$. The top panels show that there is a pronounced peak in
the number of galaxies per halo when $\langle N_{\rm gal}|m\rangle < 1$; in this
regime, there is a relatively tight correlation between the luminosity of a
galaxy and the mass of its parent halo. In the more massive halos which contain
many galaxies, there is no correlation between luminosity and halo mass, and
the number of galaxies scales approximately linearly with halo mass. Figure 23
is built up from a number of curves like those shown here.

Fig. 24. The number of subclumps in a halo as a function of parent halo mass in
a simulation at $z = 1$ (left) and 3 (right). Top panel shows
$\langle N\rangle$ (long dashed) and $\sqrt{\langle N(N-1)\rangle}$ (short
dashed) as a function of mass for all subclumps (upper lines) and for subclumps
with mass greater than $10^{10}$ (middle) and $10^{11} h^{-1} M_\odot$ (lower),
respectively. The lower solid line shows equation (132). Middle panel is
similar, but with cuts on stellar mass: all subclumps (upper lines) and
subclumps with stellar mass greater than $10^{9}$ (middle) and
$10^{10} h^{-1} M_\odot$ (lower). Bottom panel shows cuts on star-formation
rate: all subclumps (upper lines), and subclumps with star formation rates
greater than 1 (middle) and 10 (lower) $M_\odot$/yr.

Fig. 25. The mean, $\langle N\rangle$, and second factorial moment,
$\langle N(N-1)\rangle$, of the distribution of the number of galaxies per halo
as a function of the halo mass $m$. Symbols show measurements in the
semianalytic model of [157] and we have selected objects which are predicted to
have absolute magnitudes between $-19$ and $-20$ in the SDSS $r^*$-band.
Results for absolute magnitudes in the range $-17$ to $-18$, and $-18$ to
$-19$ are qualitatively similar, although the peak for the lower luminosity
bins shifts to lower masses.

The bottom panels in Figure 25 are also interesting. If $p(N_{\rm gal}|m)$ were
Poisson, then
$\langle N_{\rm gal}(N_{\rm gal}-1)|m\rangle = \langle N_{\rm gal}|m\rangle^2$.
While the Poisson model is reasonably accurate at large
$\langle N_{\rm gal}|m\rangle$, the scatter in $N_{\rm gal}$ at fixed $m$ can be
substantially less than Poisson at the low mass end. This is largely a
consequence of mass conservation [252]: the Poisson model allows an arbitrarily
large number of galaxies to be formed from a limited amount of dark matter. For
this reason, [236] argued that a binomial distribution should provide a
convenient approximation to $p(N_{\rm gal}|m)$. A binomial is specified by its
mean and its second moment. To match the semianalytic models, the mean must be
given by equation (132), and the second moment by

$$\langle N_{\rm gal}(N_{\rm gal}-1)\rangle^{1/2} = \alpha(m)\, \langle N_{\rm gal}|m\rangle, \qquad (133)$$

where $\alpha(m) = \log_{10}\sqrt{m/10^{11} h^{-1} M_\odot}$ for
$m < 10^{13} h^{-1} M_\odot$ and $\alpha(m) = 1$ thereafter. The binomial
assumption allows one to model higher order correlations, since, by analogy
with the two point correlation function, the halo model for $\xi_n$ depends on
the $n$-th moment of $p(N_{\rm gal}|m)$. For example, the bi- and trispectra
require knowledge of the third and fourth moments of $p(N_{\rm gal}|m)$.
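The second-moment prescription of equation (133) is easy to evaluate alongside a Poisson reference. The sketch below makes several assumptions: the logarithm is taken to be base 10 (which makes $\alpha$ continuous at $10^{13} h^{-1} M_\odot$), $\alpha$ is clamped to zero below $10^{11} h^{-1} M_\odot$ where no fit is quoted, and the mapping to binomial parameters is a plain moment-matching exercise that yields a formal, generally non-integer, number of trials rather than the specific construction of [236]. All function names are hypothetical.

```python
import math


def alpha(m):
    """alpha(m) of equation (133); m in units of h^-1 M_sun."""
    if m >= 1.0e13:
        return 1.0
    # Clamp at zero below 1e11 (assumption: the fit is not quoted there).
    return max(0.0, math.log10(math.sqrt(m / 1.0e11)))


def second_factorial_moment(mean_n, m):
    """<N(N-1)|m> = [alpha(m) <N|m>]^2."""
    return (alpha(m) * mean_n) ** 2


def poisson_second_factorial_moment(mean_n):
    """Poisson reference: <N(N-1)> = <N>^2."""
    return mean_n ** 2


def binomial_parameters(mean_n, m):
    """Formal (n_trials, prob) matching the mean and <N(N-1)>.

    For a binomial, <N> = n p and <N(N-1)> = n (n - 1) p^2, so
    <N(N-1)> / <N>^2 = 1 - 1/n = alpha^2. Returns None when alpha = 1,
    which is the Poisson limit (n -> infinity).
    """
    a2 = alpha(m) ** 2
    if a2 >= 1.0:
        return None
    n_trials = 1.0 / (1.0 - a2)
    return n_trials, mean_n / n_trials
```

For example, at $m = 10^{12} h^{-1} M_\odot$ the prescription gives $\alpha = 0.5$, so the second factorial moment is a quarter of the Poisson value for the same mean.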

Figure 26 shows the result of inserting the $N_{\rm gal}$-$m$ relations shown
in Figure 23 (equation 132) in the halo model, and changing nothing else (i.e.,
the red and blue galaxies are both assumed to follow the same NFW profile as
the dark matter). The symbols show measurements in the GIF semianalytic models
which equation (132) describes, and the curves, which provide a good fit,
show the halo model prediction. On small scales, the redder galaxies have a
steeper correlation function than the blue galaxies, in qualitative agreement
with the SDSS measurements shown in Figure 21. The agreement between the
simulations and the halo model calculation suggests that almost the entire
difference between the clustering of red and blue galaxies is a consequence of
the $N_{\rm gal}$-$m$ relation. The smaller additional effect which comes from
allowing the red and blue galaxies to be distributed differently around the
parent halo centre (e.g., if the reds are more centrally concentrated), is
studied in some detail in [240].

Fig. 26. Correlation functions of different tracers of the dark matter density
field in the GIF ΛCDM semianalytic galaxy formation model. Filled circles are
for the dark matter, crosses are for red galaxies, squares for galaxies which
have low star formation rates, triangles for galaxies with high star formation
rates, and open circles for blue galaxies. The two solid curves show the halo
model predictions for the red and blue galaxies, and the dashed curves show
what happens if we use the second factorial moment of the galaxy counts, rather
than the second moment, when making the model prediction. For comparison, the
dotted curve shows the predicted dark matter correlation function. Bottom panel
shows how the bias factor $\sqrt{\xi_{\rm gal}(r)/\xi_{\rm dm}(r)}$ depends on
scale. The figure is from [260].

The dependence of clustering on luminosity (Figure 21) is also straightforward 
to understand. If luminous galaxies reside in the more massive halos (this is a 
natural prediction of most semianalytic models), then, because the more massive 
halos are more strongly clustered (Figure 4), the more luminous galaxies 
should be more strongly clustered. The halo model also shows clearly that, in 
magnitude limited surveys such as the SDSS, this sort of luminosity dependent 
clustering must be taken into account when interpreting how the angular 
clustering strength depends on the magnitude limit, and when inverting $w(\theta)$ to 
estimate $P(k)$. If not, then the fact that the more strongly clustered luminous 
galaxies are seen out to larger distances, and hence contribute to the 
largest scale power, will lead to erroneous conclusions about the true amount 
of large scale power. 

Whereas most implementations of the halo model have concentrated on the 
$p(N_{\rm gal}|m)$ relation derived from semianalytic galaxy formation models (e.g., 
Figure 23), information about $p(N_{\rm gal}|m)$ is also encoded in the luminosity functions 
of galaxies and clusters. For example, [212] argue that observations of the mass-to-light 
ratio in groups (e.g., plots like Figure 22, but made using data rather 
than semianalytic models), the combined luminosity function of galaxies in 
groups and clusters, and the galaxy luminosity function itself, can together be 
used to determine the mean number of galaxies per halo mass. The idea is to 
use the galaxy luminosity function to estimate a characteristic luminosity; to use 
it to estimate the number of galaxies in a group by matching to the luminosity 
function of galaxies in groups and clusters; and to assign a mass to galaxy groups and 
clusters by requiring that the number density of groups from the halo mass 
function agree with that obtained from the luminosity function. This leads to 
a measured mean number of galaxies of the form (N ga i\m) oc m° m which is 
close to that shown in, e.g., Figure 23. The SDSS galaxy cluster catalogs offer 
a promising opportunity to exploit this approach. 

It is remarkable that this simple $N_{\rm gal}$-$m$ parametrization of the semianalytic 
models is all that is required to understand how and why the clustering 
depends on galaxy type. It is this fact which has revived interest in the halo 
model. 



6.2 Galaxy-dark matter cross power spectrum 



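The halo model also suggests a simple parameterization of the cross-correlation 
between the galaxy and dark matter distributions [244,99]: 

$$P_{\rm gal-dm}(k) = P^{1h}_{\rm gal-dm}(k) + P^{2h}_{\rm gal-dm}(k),$$

where 

$$P^{1h}_{\rm gal-dm}(k) = \int dm\, \frac{m\,n(m)}{\bar\rho}\, \frac{\langle N_{\rm gal}|m\rangle}{\bar n_{\rm gal}}\, |u_{\rm dm}(k|m)|\, |u_{\rm gal}(k|m)|^{p-1},$$

$$P^{2h}_{\rm gal-dm}(k) = P^{\rm lin}(k) \left[\int dm\, \frac{m\,n(m)}{\bar\rho}\, b_1(m)\, u_{\rm dm}(k|m)\right] \left[\int dm\, n(m)\, b_1(m)\, \frac{\langle N_{\rm gal}|m\rangle}{\bar n_{\rm gal}}\, u_{\rm gal}(k|m)\right], \qquad (134)$$

and, as before, one sets $p = 1$ if $\langle N_{\rm gal}\rangle < 1$ and one is interested in requiring 
that one galaxy always sits at the halo center. This expression is easily 
generalized to the cross-correlation between two galaxy samples. 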

If one galaxy always sits at the halo centre, then these expressions must be 
modified. To see the effect of this on the two-halo term, we must average both 
pieces of the two-halo term over $p(n|m)$, with the requirement that $n > 0$. 
This requires evaluation of sums of the form 

$$\sum_{n>0} [1 + (n-1)\,u(k|m)]\,p(n|m) = 1 - p(0|m) + \langle n-1|m\rangle\,u(k|m) + u(k|m)\,p(0|m),$$

which we could also have written as 

$$N_{\rm eff}(k|m) = [1 - p(0|m)]\,[1 - u_{\rm gal}(k|m)] + \langle n|m\rangle\,u_{\rm gal}(k|m).$$

Since both factors in the first term are positive, this shows clearly that there 
is an enhancement in power which comes from always placing one galaxy at 
the halo centre. Since $u(k|m)$ decreases as $k$ increases, the enhancement in 
power is largest on small scales (large $k$). In sufficiently massive halos one 
might expect to have many galaxies, and so $p(0|m) \ll 1$. In this limit, the 
expression above becomes $1 - u(k|m) + \langle n|m\rangle\,u(k|m) = 1 + \langle n-1|m\rangle\,u(k|m)$. 
On the other hand, if most halos have no galaxies, then $p(1|m)$ is probably 
much larger than all other $p(n|m)$ with $n \ge 2$. Then the leading order term 
in the sum above is $p(1|m)$. Since $\langle n|m\rangle = \sum_n n\,p(n|m) \approx p(1|m)$, we have 
that $N_{\rm eff}(k|m) \approx \langle n|m\rangle$. In this limit, only a fraction $\langle n|m\rangle \ll 1$ of the halos 
contain a galaxy, and the galaxy sits at the halo centre, so there is no factor 
of $u$. 
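
The equivalence of the direct sum over $n > 0$ and the closed form for $N_{\rm eff}(k|m)$ can be checked numerically. The sketch below does so for an arbitrary occupation distribution; the list `pn` (with `pn[n]` standing for $p(n|m)$) and its values are hypothetical, chosen only for illustration.

```python
def n_eff_direct(pn, u):
    """Sum over n > 0 of [1 + (n - 1) u] p(n|m), with pn[n] = p(n|m)."""
    return sum((1.0 + (n - 1) * u) * p for n, p in enumerate(pn) if n > 0)


def n_eff_closed(pn, u):
    """Closed form [1 - p(0|m)] [1 - u] + <n|m> u."""
    p0 = pn[0]
    mean_n = sum(n * p for n, p in enumerate(pn))
    return (1.0 - p0) * (1.0 - u) + mean_n * u


# Hypothetical occupation probabilities p(0|m), ..., p(3|m); they sum to one.
pn = [0.5, 0.3, 0.15, 0.05]
for u in (0.0, 0.4, 1.0):
    print(u, n_eff_direct(pn, u), n_eff_closed(pn, u))
```

At $u = 1$ (large scales) both expressions reduce to $\langle n|m\rangle$, while at $u = 0$ (small scales) they reduce to $1 - p(0|m)$, the fraction of halos which host a central galaxy.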

The contribution of the galaxy counts to the one-halo term of the galaxy-mass 
correlation function is similar. Using the expressions above yields 

$$P^{1h}_{\rm gal-dm}(k) = \int dm\, n(m)\, \frac{m}{\bar\rho}\, |u(k|m)|\, \frac{N_{\rm eff}(k|m)}{\bar n_{\rm gal}},$$

$$P^{2h}_{\rm gal-dm}(k) = P^{\rm lin}(k) \left[\int dm\, n(m)\, \frac{m}{\bar\rho}\, b(m)\, u(k|m)\right] \left[\int dm\, n(m)\, b(m)\, \frac{N_{\rm eff}(k|m)}{\bar n_{\rm gal}}\right]. \qquad (135)$$

If the run of galaxies around the halo centre is not the same as that of the dark 
matter, then one simply uses $u_{\rm gal}$ instead of $u$ in $N_{\rm eff}$. Since the two-halo term 
usually does not dominate the power on small scales (this is almost always a good 
approximation), it is reasonable to ignore the enhancement in power associated 
with the central galaxy in the two-halo term, and to simply set 
$N_{\rm eff}(k|m) \approx \langle n|m\rangle\,u_{\rm gal}(k|m) \approx \langle n|m\rangle$. The one-halo term requires knowledge 
of $p(0|m)$. Since $p(0|m)$ is usually unknown, the approximation above interpolates 
between the two limits discussed earlier by setting $N_{\rm eff} = \langle n|m\rangle\,u(k|m)$ if 
$\langle n|m\rangle > 1$, and $N_{\rm eff} = \langle n|m\rangle$ if $\langle n|m\rangle < 1$. 

In what follows, it will be convenient to define the cross-correlation coefficient: 

$$r(k) = \frac{P_{\rm gal-dm}(k)}{\sqrt{P_{\rm dm}(k)\,P_{\rm gal}(k)}}. \qquad (136)$$

Note that $r(k)$ may depend on scale $k$. 

Because we cannot measure the clustering of dark matter directly, the galaxy- 
dark matter cross-correlation is not observable. However, if one cross-correlates 
the galaxy distribution with weak lensing shear measurements, then the re- 
sulting signal is sensitive to this cross-correlation [244,99]. We discuss this 
further in § 8.7. 

6.3 Discussion 

Figure 20 showed the galaxy power spectrum from the PSCz survey [105]. The 
nonlinear dark matter power spectrum, scaled with a constant ($k$-independent) 
bias factor to match the linear regime, cannot also match the power on small 
scales: this shows that the bias between dark matter and galaxies must depend 
on scale. 

The top panel in Figure 27 shows the contributions to the dark matter power 
spectrum as a function of halo mass. The halo model description of the galaxy 
power spectrum shows clearly that the $p(N_{\rm gal}|m)$ distribution changes the 
relative contributions of low and high-mass halos to the total power, and so 
modifies the shape of the power spectrum in a way which depends on $k$. 

The main change to the amplitude of the small-scale contribution to the galaxy 
power spectrum, the change which results in a power-law shape, comes from 
the halos which contain at least one galaxy, or, effectively, halos containing 
what are called field or isolated galaxies. The assumption that these galaxies 
are at the center of the halo they occupy results in a power-law at small scales 
[212]. As discussed in [236], the contribution to the total power from such halos 
is very sensitive to the low-mass cutoff in the galaxy-mass relation. Thus, the 
small scale clustering of galaxies essentially allows one to constrain certain 
parameters related to galaxy formation, such as the minimum mass in 
which a galaxy can exist. 

Fig. 27. (a) Contributions by halo mass to the one-halo term in the halo model 
description of the power spectrum. (b) The same, but for the dark matter-galaxy 
cross-correlation power spectrum. At small scales, there is effectively no contribution 
from the smallest mass halos for the galaxy power spectrum. This figure is from 
[244]. 

Also, as shown in Figure 27, at intermediate scales, the massive halos contribute 
less to the galaxy-dark matter power spectrum than to the dark matter power 
spectrum. This is because the $N_{\rm gal} \sim m^{0.8-0.9}$ weighting suppresses the 
contribution from the high mass end of dark matter halos. Figure 28 compares the 
associated galaxy power spectrum with that measured in the PSCz survey. 
Note the power law behavior of the galaxy power spectrum over three to four 
decades in wavenumber. 



In the halo model, galaxy power spectra and higher order correlations, when 
studied as a function of galaxy type or environment, allow one to extract certain 
galaxy properties such as the mean $N_{\rm gal}$-$m$ relation, and the mean mass 
of dark matter halos in which galaxies reside. This information may be helpful 
for understanding the galaxy formation and evolution processes. In [236] and 
[237], constraints on $p(N_{\rm gal}|m)$ were obtained by comparing halo model 
predictions with the measured variance and higher order correlations of galaxies 
in the APM [176] and PSCz surveys. Halo-based constraints on galaxy formation 
models are likely to improve with ongoing wide-field surveys such as 
the Sloan Digital Sky Survey (SDSS) and the 2dFGRS. The halo approach to 
galaxy clustering has already become helpful for interpreting the SDSS two-point 
galaxy correlation function [69] and the lensing-mass correlation [99]. 

Fig. 28. The PSCz galaxy power spectrum (symbols) and the result of tuning the 
first two moments of $p(N_{\rm gal}|m)$ so as to produce the power-law-like behavior (solid 
curve). 



7 Velocities 

One of the great strengths of the halo-based approach is that it provides a 
clear prescription for identifying the scale on which perturbative approaches 
will break down, and non-linear effects dominate. The separation of linear and 
non-linear scales is an important tool when describing large scale velocities and 
related statistics. We now present the halo model description of velocities by 
extending [249,253]. 



7.1 Velocities of and within halos 

In the model, all dark matter particles are assumed to be in approximately 
spherical, virialized halos. The velocity of a dark matter particle is the sum of 
two terms, 

$$v = v_{\rm vir} + v_{\rm halo}: \qquad (137)$$

the first is due to the velocity of the particle about the center of mass of its 
parent halo, and the second is due to the motion of the center of mass of 
the parent. We will assume that each of these terms has a dispersion which 
depends on both halo mass and on the local environment, so that 

$$\sigma^2(m,\delta) = \sigma^2_{\rm vir}(m,\delta) + \sigma^2_{\rm halo}(m,\delta). \qquad (138)$$



The expression above assumes two things: that the rms velocities depend on halo 
mass and local density only, and that the rms virial velocity within a halo 
is independent of the motion of the halo itself. Presumably both assumptions 
break down if the dark matter is collisional and/or dissipative. For collisionless 
matter, the assumption that the virial motions within a halo are independent 
of the halo's environment is probably reasonably accurate. It is not clear that 
the same is true for halo speeds. Indeed, it has been shown that halos in dense 
regions move faster than those in underdense regions [42,253]. It will turn 
out, however, that the fraction of regions in which $\sigma_{\rm halo}(m,\delta)$ is significantly 
different from $\sigma_{\rm halo}(m,0)$ is quite small. This means that neglecting the density 
dependence of halo velocities should be a reasonable approximation. 

Consider the first term, $v_{\rm vir}$. We will assume that virialized halos are isothermal 
spheres, so that the distribution of velocities within them is Maxwellian. This 
is in reasonable agreement with measurements of virial velocities within halos 
in numerical simulations. If $\sigma_{\rm vir}$ denotes the rms speeds of particles within a 
halo, then the virial theorem requires that 

$$\frac{Gm}{r} \propto \sigma^2_{\rm vir} \propto \frac{H(z)^2}{2}\,\Delta^{1/3}_{\rm vir}(z) \left(\frac{3m}{4\pi\rho_{\rm crit}(z)}\right)^{2/3}, \qquad (139)$$

where the final proportionality comes from the fact that all halos have the same 
density whatever their mass: $m/r^3 \propto \Delta_{\rm vir}\,\rho_{\rm crit}$. This shows that $\sigma_{\rm vir} \propto m^{1/3}$: 
the more massive halos are expected to be 'hotter'. At fixed mass, the constant 
of proportionality depends on time and cosmology, and on the exact shape of 
the density profile of the halo. A convenient fitting formula is provided by [24]: 



where $g_\sigma = 0.9$, and 

$$\Delta_{\rm vir} = 18\pi^2 + 60x - 32x^2, \quad {\rm with} \quad x = \Omega(z) - 1 \qquad (141)$$

and $\Omega(z) = [\Omega_m (1+z)^3]\,[H_0/H(z)]^2$. This fitting formula for the average 
density within a virialized object, $\Delta_{\rm vir}\,\rho_{\rm crit}$, generalizes the value $18\pi^2$ given 
previously for an Einstein-de Sitter universe in equation (52). 
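
The scaling $\sigma_{\rm vir} \propto m^{1/3}$ and the density contrast of equation (141) are easy to evaluate numerically. The Python sketch below is illustrative only: it adopts the normalisation $\sigma^2_{\rm vir} = g_\sigma^2\,Gm/(2r)$, which is an assumption made here and not necessarily the normalisation of the fitting formula of [24], together with standard values of $G$ and of the present-day critical density. The default cosmological parameters are likewise hypothetical.

```python
import math

G_NEWTON = 4.301e-9    # Mpc (km/s)^2 / Msun
RHO_CRIT_0 = 2.775e11  # h^2 Msun / Mpc^3, critical density today


def hubble_ratio_sq(z, omega_m, omega_l):
    """[H(z)/H_0]^2 for matter, curvature and a cosmological constant."""
    omega_k = 1.0 - omega_m - omega_l
    return omega_m * (1 + z) ** 3 + omega_k * (1 + z) ** 2 + omega_l


def omega_of_z(z, omega_m, omega_l):
    """Omega(z) = Omega_m (1 + z)^3 [H_0 / H(z)]^2."""
    return omega_m * (1 + z) ** 3 / hubble_ratio_sq(z, omega_m, omega_l)


def delta_vir(omega_z):
    """Equation (141): 18 pi^2 + 60 x - 32 x^2 with x = Omega(z) - 1."""
    x = omega_z - 1.0
    return 18.0 * math.pi ** 2 + 60.0 * x - 32.0 * x ** 2


def sigma_vir(m, z=0.0, omega_m=0.3, omega_l=0.7, g_sigma=0.9):
    """Virial dispersion in km/s for a halo of mass m (in h^-1 Msun).

    ASSUMED normalisation: sigma_vir^2 = g_sigma^2 G m / (2 r_vir).
    """
    rho_crit = RHO_CRIT_0 * hubble_ratio_sq(z, omega_m, omega_l)
    contrast = delta_vir(omega_of_z(z, omega_m, omega_l))
    # Virial radius in h^-1 Mpc from m = (4 pi / 3) r^3 Delta_vir rho_crit.
    r_vir = (3.0 * m / (4.0 * math.pi * contrast * rho_crit)) ** (1.0 / 3.0)
    return g_sigma * math.sqrt(G_NEWTON * m / (2.0 * r_vir))
```

Because $r \propto m^{1/3}$ at fixed density contrast, increasing the mass by a factor of eight doubles `sigma_vir`.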

In [253], it was shown that this virial relation between mass and velocities is 
independent of the local environment. In practice, however, $\sigma_{\rm vir}$ may depend 
on position within the halo. Accounting for the fact that halos really have 
more complicated density and velocity profiles is a detail which complicates 
the analysis, but not the logic of the argument. If the virialized halo is an 
isothermal sphere, so that the density around the halo center falls as the inverse 
square of the distance, then $\sigma_{\rm vir}$ is the same everywhere within the halo. In practice, 
halos are not quite isothermal, but we will show later that the scaling above 
is still both accurate and useful. 

We now turn to the second term, $v_{\rm halo}$. It will prove more convenient to first 
study halo speeds after averaging over all environments, before considering 
the speeds as a function of local density. This is similar to the order in which 
we discussed the halo mass function and its dependence on density. We first 
consider a halo of size $r$ at the present time. Because the initial density 
fluctuations were small, the particles in this halo must have been drawn from 
a larger region $R$ in the initial conditions: $R/r \approx \Delta^{1/3}_{\rm vir}$, where $\Delta_{\rm vir} \approx 200$ or 
so. This means, for example, that massive halos were assembled from larger 
regions than less massive halos. Suppose we compute the rms value of the 
initial velocities of all the particles which make up a given halo, and extend 
this to include all halos of mass $m$; then we have effectively computed the rms 
velocity in linear theory, smoothed on the scale $R(m) \propto m^{1/3}$. 

It is well known that the linear theory prediction for the evolution of velocities 
is more accurate than the linear theory prediction for the evolution of the 
density [216]. In what follows, we will assume that at the present time, the 
velocities of halos are reasonably well described by the velocities of peaks, 
smoothed on the relevant scale, $R \propto m^{1/3}$, and extrapolated using linear theory. 
For Gaussian initial conditions, this means that any given value of $v_{\rm halo}$ is 
drawn from a Maxwellian with dispersion $\sigma_{\rm halo}(m)$ given by: 


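$$\sigma_{\rm halo}(m) = H_0\,\Omega_m^{0.6}\,\sigma_{-1}\,\sqrt{1 - \frac{\sigma_0^4}{\sigma_1^2\,\sigma_{-1}^2}}, \qquad (142)$$

where 

$$\sigma_j^2(m) = \frac{1}{2\pi^2}\int dk\, k^{2+2j}\, P(k)\, W^2[kR(m)],$$

and $W(x)$ is the Fourier transform of the smoothing window. The factor $H_0\,\Omega_m^{0.6}$ 
comes from a well-known approximation to the derivative of the growth function, 
with $d\log G/d\log a \approx \Omega_m^{0.6}$, where $a$ is the scale factor. Notice that the 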
Fig. 29. Dependence on halo mass of the non-linear ($\sigma_{\rm vir}$) and linear theory ($\sigma_{\rm halo}$) 
terms in our model. Solid curves show the scaling we assume, and symbols show 
the corresponding quantities measured in the z = 0 output time of the ΛCDM GIF 
simulation. Error bars show the 90 percentile ranges in mass and velocity. Dashed 
curve in panel on right shows the expected scaling after accounting for the finite 
size of the simulation box. Symbols and curves in the bottom of the panel on the 
right show the predicted and actual velocities at z = 20. 

predicted rms velocity depends both on cosmology and on the shape of the 
power spectrum. The term under the square-root arises from the peak 
constraint [6]; it tends to unity as $m$ decreases: the peak constraint becomes 
irrelevant for the less massive, small $R$, objects. 

In Figure 29, we compare the dependence on mass of the two velocity terms 
measured in numerical simulations by the GIF collaboration [157] with the dependences 
we have discussed above. The symbols with error bars show the median and 
ninety percentile ranges in mass and velocity. Open squares, filled squares, 
open circles and filled circles show halos which have 60 to 100, 100 to $10^3$, 
$10^3$ to $10^4$ and $10^4$ to $10^5$ particles, respectively. There are two sets of symbols 
in the panel on the right. For the time being, we are only interested in the 
symbols in the upper half which show halo velocities at $z = 0$. The solid curves 
in the two panels show the scalings we assume. 

Although the scaling of the virial term with mass is quite accurate, it appears 
that the extrapolated linear theory velocities are slightly in excess of 
the measurements in the simulations. This is almost entirely due to the finite 
size of the simulation box. The upper dashed curve shows the effect of using 
equation (142) to estimate the rms speeds of halos, after setting $P(k) = 0$ for 
$k < 2\pi/L$, where $L$ is the box-size: $L = 141$ Mpc/h. Thus, the two panels 
show that our simple estimates of the two contributions to the variance of the 
velocity distribution are reasonably accurate. 
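
The effect of the box-size cutoff on the halo speeds can be illustrated with a short numerical sketch. It assumes the peaks-theory form $\sigma_{\rm halo} = H_0\,\Omega_m^{0.6}\,\sigma_{-1}\,[1 - \sigma_0^4/(\sigma_1^2\sigma_{-1}^2)]^{1/2}$ with $\sigma_j^2 = (2\pi^2)^{-1}\int dk\,k^{2+2j}P(k)W^2(kR)$, a Gaussian smoothing window, and a toy power spectrum in arbitrary units. The toy spectrum, the window, the integration limits and the unit prefactor are all choices made here for illustration and are not taken from the simulations.

```python
import math


def toy_power_spectrum(k, k_turn=0.02):
    """Hypothetical CDM-like shape: ~k at small k, ~k^-3 at large k."""
    return k / (1.0 + (k / k_turn) ** 2) ** 2


def spectral_moments(radius, k_min=0.0, n_steps=4000):
    """sigma_j^2 for j = -1, 0, 1, with P(k) set to zero below k_min.

    Uses a Gaussian window W(x) = exp(-x^2 / 2) and the trapezoidal rule
    in ln k; radius is in h^-1 Mpc and k in h Mpc^-1.
    """
    ln_lo = math.log(1.0e-5)
    ln_hi = math.log(20.0 / radius)
    step = (ln_hi - ln_lo) / n_steps
    sums = {-1: 0.0, 0: 0.0, 1: 0.0}
    for i in range(n_steps + 1):
        k = math.exp(ln_lo + i * step)
        if k < k_min:
            continue  # no power on scales larger than the box
        weight = step * (0.5 if i in (0, n_steps) else 1.0)
        window_sq = math.exp(-(k * radius) ** 2)
        common = weight * k * toy_power_spectrum(k) * window_sq
        for j in sums:
            sums[j] += common * k ** (2 + 2 * j)
    return {j: s / (2.0 * math.pi ** 2) for j, s in sums.items()}


def sigma_halo(radius, k_min=0.0, prefactor=1.0):
    """Peak-constrained rms halo speed; prefactor stands for H_0 Omega_m^0.6."""
    mom = spectral_moments(radius, k_min)
    peak_term = 1.0 - mom[0] ** 2 / (mom[1] * mom[-1])
    return prefactor * math.sqrt(mom[-1] * peak_term)


full_box = sigma_halo(5.0)
finite_box = sigma_halo(5.0, k_min=2.0 * math.pi / 141.0)
print(finite_box / full_box)  # smaller than one: missing large-scale power
```

Removing the modes with $k < 2\pi/L$ can only lower the predicted speeds, which is the sense of the correction shown by the dashed curve in Figure 29.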

Notice that the two terms scale differently with halo mass; indeed, to a first 
approximation, one might even argue that halo speeds are independent of halo 
mass. Figure 29 shows that $\sigma_{\rm halo}(m) < \sigma_{\rm vir}(m)$ for massive halos. Since massive 
halos have larger dispersions than less massive halos, the large velocity tail of 
$f(v)$ is determined primarily by the non-linear virial motions within massive 
halos, rather than by the peculiar motions of the halo centers of mass. For 
this reason, the large velocity tail of $f(v)$, at least, is unlikely to be sensitive 
to inaccuracies in our treatment of halo velocities, or to our neglect of the 
possibility that halo speeds may depend on the environment. Before moving 
on, note that our finding that massive halos are hotter, whereas the speed 
with which halos move is approximately independent of mass, suggests that if 
massive halos occupy denser regions, then we expect a temperature-density 
relation such that denser regions should be hotter. We will return to this later. 

7.2 The distribution of non-linear velocities 

In an ideal gas, the distribution of particle velocities f(v) dv has the Maxwell- 
Boltzmann form: each cartesian component of the velocity is drawn from an 
independent Gaussian distribution. Because of the action of gravity, the dark 
matter distribution at the present time is certainly not an ideal gas; numeri- 
cal simulations show that f(v) dv is very different from a Maxwell-Boltzmann 
[231]; the distribution of each component of the velocity has an approximately 
Gaussian core with exponential wings. The halo model decomposition of pecu- 
liar velocities into linear and non-linear contributions (equation 137), provides 
a simple explanation for why this is so [249,253]. 

Let $p(v|m)\,dv$ denote the probability that a particle in a halo of mass $m$ 
has velocity in the range $dv$ about $v$. Then the total distribution is given by 
summing up the various $p(v|m)$ distributions, weighting by the fraction of 
particles which are in halos of mass $m$: 

$$f(v) = \frac{\int dm\, m\,n(m)\, p(v|m)}{\int dm\, m\,n(m)} = \int dm\, \frac{m\,n(m)}{\bar\rho}\, p(v|m), \qquad (143)$$

where $n(m)\,dm$ is the number density of halos that have mass in the range 
$dm$ about $m$. The weighting by $m$ reflects the fact that the number of dark 
matter particles in a halo is supposed to be proportional to the halo mass. 
This expression holds both for the size of the velocity vector itself, which we 
will often call the speed, as well as for the individual velocity components. 

To proceed, we need a model for the actual shape of $p(v|m)$. Since $v$ is the 
sum of two random variates (equation 137), we study each in turn. The virial 
motions are assumed to be Maxwellian. Also, for Gaussian initial density 
fluctuations, the linear peaks theory model of the halo motions means that they 
too are Maxwellian. Thus, in the model, each of the three cartesian components 
of the velocity of a dark matter particle in a clump of mass $m$ is given 
by the sum of two Gaussian distributed random variates, one with dispersion 
$\sigma^2_{\rm vir}(m)/3$ and the other with $\sigma^2_{\rm halo}(m)/3$. If we further assume that the motion 
around the clump center is independent of the motion of the clump as a whole, 
then these two Gaussian variates are independent and $p(v|m)$ is a Maxwellian 
with a dispersion given by the sum in quadrature of equations (140) and (142). 

In practice, we are only likely to observe velocities along the line of sight. Thus, 
we will eventually be interested in the distribution of $f(v)$ projected along 
the line of sight. Projection changes the Maxwellian $p(v|m)$ distributions into 
Gaussians: 

$$p(v|m) = \frac{e^{-[v/\sigma(m)]^2/2}}{\sqrt{2\pi\sigma^2(m)}}, \quad {\rm where} \quad \sigma^2(m) = \frac{\sigma^2_{\rm vir}(m)}{3} + \frac{\sigma^2_{\rm halo}(m)}{3}; \qquad (144)$$

i.e., $\sigma^2(m)$ is one third of the sum in quadrature of equations (140) and (142). 
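
A mass-weighted sum of Gaussians whose widths grow with halo mass necessarily has heavier tails than a single Gaussian. The following Python sketch makes this concrete for a toy set of halo masses. The mass fractions, the two velocity scales and the mass grid are hypothetical numbers chosen for illustration; only the structure of equations (143) and (144), with $\sigma^2_{\rm vir} \propto m^{2/3}$ and a mass-independent $\sigma_{\rm halo}$, is taken from the text.

```python
import math

SIGMA_VIR_STAR = 300.0  # hypothetical sigma_vir(m_*) in km/s
SIGMA_HALO = 400.0      # hypothetical, mass-independent sigma_halo in km/s

MASS_RATIOS = [0.01, 0.1, 1.0, 10.0, 100.0]  # m / m_*
WEIGHTS = [0.30, 0.30, 0.25, 0.12, 0.03]     # hypothetical mass fractions


def sigma_1d(mass_ratio):
    """One-dimensional dispersion of equation (144)."""
    variance = (SIGMA_VIR_STAR ** 2 * mass_ratio ** (2.0 / 3.0)
                + SIGMA_HALO ** 2) / 3.0
    return math.sqrt(variance)


def mixture_pdf(v, mass_ratios, weights):
    """f(v) of equation (143) as a discrete, mass-weighted sum of Gaussians."""
    total = 0.0
    for ratio, weight in zip(mass_ratios, weights):
        s = sigma_1d(ratio)
        total += weight * math.exp(-0.5 * (v / s) ** 2) / math.sqrt(2.0 * math.pi * s * s)
    return total


def mixture_kurtosis(mass_ratios, weights):
    """<v^4> / <v^2>^2 of the mixture; equals 3 for a single Gaussian."""
    second = sum(w * sigma_1d(r) ** 2 for r, w in zip(mass_ratios, weights))
    fourth = sum(3.0 * w * sigma_1d(r) ** 4 for r, w in zip(mass_ratios, weights))
    return fourth / second ** 2


print(mixture_kurtosis(MASS_RATIOS, WEIGHTS))
```

The kurtosis exceeds the Gaussian value of 3 whenever more than one dispersion contributes, which is the discrete analogue of the non-Gaussian wings discussed in this section.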

Now, $\sigma^2_{\rm vir}(m)/\sigma^2_{\rm vir}(m_*) = (m/m_*)^{2/3}$, whereas $\sigma^2_{\rm halo}$ is independent of halo mass 
(Figure 29). Therefore, the characteristic function of $f(v)$ is 

$$\int dv\, e^{ivt} f(v) = \int dm\,\frac{m\,n(m)}{\bar\rho}\int dv\, e^{ivt}\,p(v|m)$$

$$= \int dm\,\frac{m\,n(m)}{\bar\rho}\; e^{-t^2\sigma^2_{\rm vir}(m)/6}\; e^{-t^2\sigma^2_{\rm halo}/6}$$

$$= e^{-t^2\sigma^2_{\rm halo}/6}\int\frac{d\nu}{\nu}\,\sqrt{\frac{\nu}{2\pi}}\; e^{-\nu/2}\; e^{-t^2\sigma^2_{\rm vir}(m_*)\,\nu/6}$$

$$= \frac{\exp(-t^2\sigma^2_{\rm halo}/6)}{[1 + t^2\sigma^2_{\rm vir}(m_*)/3]^{1/2}}. \qquad (145)$$

The penultimate expression uses equation (57) for the halo mass function and 
assumes that the initial spectrum of fluctuations was scale free with $P(k) \propto k^{-1}$, 
so that $\nu = (m/m_*)^{2/3}$, which should be a reasonable approximation to the CDM 
spectrum on cluster scales. The final expression is quite simple: it is the product 
of the Fourier transforms of a Gaussian and a K-Bessel function. Therefore, $f(v)$ is 
the convolution of a Gaussian with a K-Bessel function. The Bessel function 
has exponential wings. Because the dispersions of the Gaussian and the Bessel 
function are similar, equation (145) shows that $f(v)$ should have a Gaussian 
Fig. 30. The distribution of one-dimensional peculiar velocities for dark matter 
particles in a ΛCDM cosmology. Histograms show the distribution of the three 
cartesian components measured in GIF simulations. Dashed and dot-dashed curves 
show Gaussian and exponential distributions which have the same dispersion. The 
solid curve shows the distribution predicted by our model, after accounting for the 
finite size of the simulation box. The exponential wings are almost entirely due to 
virial motions within halos. 

core which comes from the linear theory halo motions, with exponential wings 
which come from the non-linear motions within halos. 
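
The closed form for the characteristic function can be verified numerically. The sketch below assumes a Press-Schechter mass weighting of the form $(d\nu/\nu)\sqrt{\nu/2\pi}\,e^{-\nu/2}$ with $\nu = (m/m_*)^{2/3}$ (appropriate for $P(k) \propto k^{-1}$), and compares a direct numerical integration over $\nu$ with the expression $\exp(-t^2\sigma^2_{\rm halo}/6)/[1 + t^2\sigma^2_{\rm vir}(m_*)/3]^{1/2}$. The function names and the velocity scales used in the example call are hypothetical.

```python
import math


def char_fn_closed(t, sigma_vir_star, sigma_halo):
    """Closed form: Gaussian factor times the K-Bessel-type factor."""
    gaussian = math.exp(-t * t * sigma_halo ** 2 / 6.0)
    return gaussian / math.sqrt(1.0 + t * t * sigma_vir_star ** 2 / 3.0)


def char_fn_numeric(t, sigma_vir_star, sigma_halo, n_steps=20000):
    """Integrate over nu with the substitution nu = x^2.

    With nu = x^2 the integrand (d nu / nu) sqrt(nu / 2 pi) exp(-a nu)
    becomes 2 dx exp(-a x^2) / sqrt(2 pi), where
    a = 1/2 + t^2 sigma_vir(m_*)^2 / 6.
    """
    a = 0.5 + t * t * sigma_vir_star ** 2 / 6.0
    x_max = 10.0 / math.sqrt(a)  # integrand is ~exp(-100) at the upper limit
    step = x_max / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        weight = 0.5 if i in (0, n_steps) else 1.0
        x = i * step
        total += weight * math.exp(-a * x * x)
    integral = 2.0 * step * total / math.sqrt(2.0 * math.pi)
    return math.exp(-t * t * sigma_halo ** 2 / 6.0) * integral


print(char_fn_closed(0.005, 300.0, 400.0), char_fn_numeric(0.005, 300.0, 400.0))
```

The two numbers agree to high precision, which confirms that the integral over the mass function produces the $[1 + t^2\sigma^2_{\rm vir}(m_*)/3]^{-1/2}$ factor.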

Figure 30 shows the one-dimensional $f(v)$ distribution given by inserting 
equations (144) and (59) in equation (143) for the same cosmological model 
presented in Figure 29. The histograms show the distribution measured in GIF 
simulations. For comparison, the dashed and dot-dashed curves in each panel 
show Gaussian and exponential distributions which have the same dispersion. 
The solid curves show the distribution predicted by the halo-based model: note 
the exponential wings, and the small $|v|$ core that is more Gaussian than 
exponential. The exponential wings are almost entirely due to non-linear motions 
within massive halos, so they are fairly insensitive to our assumptions about 
how fast these halos move. 

It is worth emphasizing that $\sigma(m)$ in equation (144) is set by the cosmological 
model and the initial conditions. Thus, the second moment of the distribution 
in Figure 30 is not a free parameter. This halo model for $p(v|m)$ can be thought 
of as a simple way in which contributions to the velocity distribution statistic 
are split up into a part which is due to non-linear effects, given by the first term 
in equation (137), and a part which follows from extrapolating linear theory to 
a later time, denoted by the second term of that equation. The agreement 
with simulations suggests that this simple treatment of non-linear and linear 
contributions to the statistic is quite accurate. 

Before moving on, note that the second moment of this distribution gives the 
mass-weighted velocity dispersion: 

$$\sigma^2_{\rm vel} = \int dm\, \frac{m\,n(m)}{\bar\rho} \left[\sigma^2_{\rm vir}(m) + \sigma^2_{\rm halo}(m)\right]. \qquad (146)$$

This quantity is a measure of the total kinetic energy in the Universe, and 
hence is directly related to the Layzer-Irvine Cosmic Energy equation [258]. 
Observational estimates of this quantity are discussed by [65]. Because the 
virial velocities within massive halos are substantially larger than the motions of 
the halos themselves (Figure 29), setting 



$$\sigma^2_{\rm halo}(m) = \left(H_0\,\Omega_m^{0.6}\right)^2 \int \frac{dk}{2\pi^2}\, P^{\rm lin}(k)\, |W[kR(m)]|^2 \qquad (147)$$

(i.e., ignoring the peak constraint and simply assuming that halo velocities 
trace the linear velocity field smoothed at the scale from which halos collapsed) 
is a reasonable approximation. This expression for $\sigma_{\rm vel}$ will be useful in the 
analyses of the CMB which follow. 



7.3 Pairwise velocities 



It is reasonable to expect that, as a result of their gravitational interaction, 
pairs of particles will, on average, approach each other. The gravitational 
attraction depends on separation, and it must fight the Hubble expansion 
which also depends on separation, so one might expect the mean velocity 
of approach to depend on the separation scale r. In fact, pair conservation 
provides a relation between the rate at which the correlation function on scale 
r evolves, and the mean pairwise motion at that separation. In particular, pair 
conservation requires the mean peculiar velocity between a pair of particles at 
separation r to satisfy [216] 

$$-\frac{v_{12}(r)}{Hr} = \frac{1}{3[1+\xi(r)]}\,\frac{\partial(1+\bar\xi)}{\partial\ln a}, \qquad (148)$$

where $\bar\xi(x)$ is the volume averaged correlation function on proper scale $x$, 
$\bar\xi(x) = 3x^{-3}\int_0^x dy\,y^2\,\xi(y)$. Since we have an accurate model for $\xi(r)$, we can use 
it to estimate $v_{12}(r)$. Before we do so, it is useful to see what linear theory 
would predict. 
Fig. 31. The ratio of the mean streaming velocity of dark matter particles 
separated by $r$ to the Hubble expansion at that scale. Triangles show the Virgo 
simulation measurements, circles show the GIF ΛCDM simulation, and dot-dashed 
curves show the Hubble expansion velocity. Crosses show the result of using the 
[210] formulae for the correlation function in the fitting formula provided by [142]. 
Solid curves show the halo model described here which accounts for the fact that the 
nonlinear evolution is different from what linear theory predicts, and then weights 
the linear and nonlinear scalings by the relative fractions of linear and nonlinear 
pairs. Dashed curves show the two contributions to the streaming motion in the halo 
model; the curves which peak at large $r$ are for pairs in two different halos. Dotted 
curve shows the approximation of using the linear theory correlation function to 
model this two-halo term. 



In linear theory, $\partial\bar\xi/\partial\ln a \approx 2f(\Omega_m)\,\bar\xi$, where $f(\Omega_m)$ comes from the usual 
approximation to the derivative of the growth function: 
$f(\Omega_m) = \partial\ln G/\partial\ln a \approx \Omega_m^{0.6}$. Thus, in linear theory, 

$$-\frac{v_{12}(r)}{Hr} = \frac{2f(\Omega_m)\,\bar\xi(r,a)}{3[1+\xi(r,a)]}. \qquad (149)$$

On large scales where linear theory analyses can still be applied, $\xi(r,a) \ll 1$. 
If we drop this term from the denominator then this expression for the 
mean pairwise velocity is the same as that obtained directly from linear theory 
[102,202,142]. This linear theory expression underestimates velocities on small 
nonlinear scales by a factor of $\sim 3/2$. 
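
To make the pair-conservation estimate concrete, the following Python sketch computes the volume averaged correlation function by direct integration and evaluates the linear theory streaming ratio. It assumes the form $-v_{12}/(Hr) = 2f(\Omega_m)\,\bar\xi/\{3[1+\xi]\}$ with $f(\Omega_m) \approx \Omega_m^{0.6}$, and uses a power-law $\xi(r)$ purely as a test case; the correlation length, slope and default $\Omega_m$ are hypothetical.

```python
import math


def xi_power_law(r, r0=5.0, gamma=1.8):
    """Hypothetical power-law correlation function (r0 / r)^gamma."""
    return (r0 / r) ** gamma


def xi_bar(xi, r, n_steps=4000, ln_span=40.0):
    """Volume average 3 r^-3 int_0^r dy y^2 xi(y), integrated in ln y.

    The lower limit is r exp(-ln_span); this assumes y^3 xi(y) -> 0
    as y -> 0, which holds for slopes shallower than 3.
    """
    step = ln_span / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        y = r * math.exp(-i * step)
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * y ** 3 * xi(y)
    return 3.0 * step * total / r ** 3


def linear_streaming_ratio(xi, r, omega_m=0.3):
    """-v12 / (H r) in linear theory; positive values mean net infall."""
    growth_rate = omega_m ** 0.6
    return 2.0 * growth_rate * xi_bar(xi, r) / (3.0 * (1.0 + xi(r)))


print(xi_bar(xi_power_law, 10.0), 3.0 * xi_power_law(10.0) / (3.0 - 1.8))
```

For a power law of slope $\gamma$ the volume average is $3\xi/(3-\gamma)$, which provides a simple check of the numerical integration.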

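If we use the halo model decomposition $\xi = \xi^{1h} + \xi^{2h}$, and then use the fact 
that $\xi^{2h}$ scales like the linear theory correlation function, then equation (148) 
becomes 

$$-\frac{v_{12}(r)}{Hr} = \frac{1}{3[1+\xi(r,a)]}\left[2f(\Omega_m)\,\bar\xi^{2h}(r,a) + \frac{\partial\bar\xi^{1h}}{\partial\ln a}\right]. \qquad (150)$$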



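The next step is to compute the derivative of the single halo term. Since $\xi^{1h}$ 
depends on the halo mass function and density profiles, the derivative can be 
computed directly [173,258]: 

$$\frac{\partial\bar\xi^{1h}}{\partial\ln a} = \frac{\partial\ln m_*}{\partial\ln a}\left[\bar\xi^{1h}(r,a) - \xi^{1h}(r,a)\right] + \frac{3}{r^3}\int dr'\,r'^2 \int dm\,\frac{n(m)\,\lambda(r'|m)}{\bar\rho^2}\,\frac{\partial\ln\lambda}{\partial\ln a}, \qquad (151)$$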



where $\lambda(r|m)$ denotes the convolution of the density profile with itself: 

$$\lambda(r|m) = 2\pi\int dy\,y^2\,\rho(y|m)\int_{-1}^{1} d\beta\,\rho(z|m), \qquad z^2 = y^2 + r^2 - 2yr\beta. \qquad (152)$$



Now, $\partial\ln\lambda/\partial\ln a \approx (\partial\ln\lambda/\partial\ln c)\,(\partial\ln c/\partial\ln a)|_{m/m_*}$; since the time 
dependence of $c$ only comes from its dependence on $m_*$ and the derivative is taken 
by keeping $m/m_*$ constant, this term is zero. The piece which remains depends 
on $\partial\ln m_*/\partial\ln a$. If $P(k) \propto k^n$, then $\partial\ln m_*/\partial\ln a = f(\Omega_m)\,6/(3+n)$ and 
[258] 

$$-\frac{v_{12}(r)}{Hr} = \frac{f(\Omega_m)}{3[1+\xi(r,a)]}\left\{2\,\bar\xi^{2h}(r,a) + \frac{6}{3+n_*}\left[\bar\xi^{1h}(r,a) - \xi^{1h}(r,a)\right]\right\}, \qquad (153)$$

where $n_* = -1.53$ is the slope of the power spectrum on scale $m_*$ for the 
ΛCDM cosmology. Figure 31 compares the mean pairwise velocities from this 
halo model calculation with measurements in numerical simulations. 
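
The way the one- and two-halo pieces combine can be sketched in a few lines of Python. The function below assumes the form $-v_{12}/(Hr) = f(\Omega_m)\,\{2\bar\xi^{2h} + [6/(3+n_*)]\,[\bar\xi^{1h} - \xi^{1h}]\}/\{3[1+\xi]\}$ with $f(\Omega_m) \approx \Omega_m^{0.6}$; the argument names are hypothetical, and the correlation functions must be supplied by the caller.

```python
def halo_model_streaming_ratio(xi_1h, xi_2h, xibar_1h, xibar_2h,
                               omega_m=0.3, n_star=-1.53):
    """-v12 / (H r) from the one- and two-halo correlation functions.

    xi_* are the correlation functions at separation r and xibar_* their
    volume averages within r. The two-halo term evolves as in linear
    theory; the one-halo term evolves through the growth of m_*.
    """
    growth_rate = omega_m ** 0.6
    xi_total = xi_1h + xi_2h
    bracket = 2.0 * xibar_2h + 6.0 / (3.0 + n_star) * (xibar_1h - xi_1h)
    return growth_rate * bracket / (3.0 * (1.0 + xi_total))


# Large scales: no one-halo pairs, so the linear theory result is recovered.
print(halo_model_streaming_ratio(0.0, 0.05, 0.0, 0.1))
```

When the one-halo terms vanish the expression reduces to the linear theory form. For $\Omega_m = 1$ and a one-halo power law with slope $\gamma = 3(3+n)/(5+n)$, for which $\bar\xi - \xi = \gamma\,\xi/(3-\gamma)$, it tends to unity at large $\xi$, i.e. the mean infall cancels the Hubble flow.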



Extending the approach to estimate how the mean pairwise velocity of halos or 
of galaxies depends on scale requires modeling the halo or galaxy correlation 
function. This in turn requires estimating how the halo-mass dependent bias 
factors evolve. The evolution of the bias factors is straightforward to compute 
[260]. The resulting 2-halo contribution to v± 2 is 



where % represents a tracer of halo with a large scale bias 6; with respect to the 
linear density field. The 1-halo contribution to the pairwise peculiar velocities 
follow similar to the relation for dark matter in equation (151), but with the 
galaxy or halo correlation function substituted for the dark matter. 

When combined with the BBGKY hierarchy, the halo model of the two- and 
three-point correlation functions allows one to estimate how the pairwise ve- 
locity dispersion depends on scale. Although this calculation can, in principle, 
be done exactly, a considerably simpler but reasonably accurate approxima- 
tion is sketched in [259]. A halo model calculation of the full distribution of 
pairwise velocities on small scales is given in [249]; when combined with results from
[253] and [259], it can be extended to larger scales, although this has yet to 
be done. 



7.4 Momentum and Velocity Power Spectra 

We have already provided an estimate of the mass-weighted velocity dispersion
(equation 146). Since mass times velocity defines a momentum, we will now
study the statistics of the momentum field. Specifically, define the momentum
$\mathbf{p} = (1 + \delta)\mathbf{v}$. In Fourier space, the divergence of the momentum is

$$i\mathbf{k}\cdot\mathbf{p}(\mathbf{k}) = i\mathbf{k}\cdot\mathbf{v}(\mathbf{k}) + \int \frac{d^3k'}{(2\pi)^3}\, \delta(\mathbf{k}-\mathbf{k}')\, i\mathbf{k}\cdot\mathbf{v}(\mathbf{k}'). \qquad (154)$$

The first term gives the contribution from the velocity field in the linear
regime, $\delta \ll 1$, while the non-linear aspects are captured by the second
term, the convolution that arises from the $\delta\mathbf{v}$ product. We can write
the power spectrum of the divergence of the momentum density field [174] as
Fig. 32. Three-dimensional power spectrum of the momentum density field parallel
(upper) and perpendicular (lower) to the wavevector $\mathbf{k}$. The halo model estimate is
compared to numerical simulations and to calculations based on second-order
perturbation theory. Note that the parallel component contributes to the time derivative
of the density field (§ 9.3) through the continuity equation, whereas the
perpendicular component (involving the momentum density field of baryons, rather than dark
matter) contributes to the kinetic Sunyaev-Zel'dovich effect (§ 9.2). This figure is
from [174].

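$$k^2 P_{pp}(k) = k^2 P_{vv}(k) + k^2 \int \frac{d^3k'}{(2\pi)^3}\, \mu'^2\, P_{\delta\delta}(|\mathbf{k}-\mathbf{k}'|)\, P_{vv}(k') + k^2 \int \frac{d^3k'}{(2\pi)^3}\, \frac{(k - k'\mu')\,\mu'}{|\mathbf{k}-\mathbf{k}'|}\, P_{\delta v}(|\mathbf{k}-\mathbf{k}'|)\, P_{\delta v}(k'), \qquad (156)$$

where $\mu' = \hat{\mathbf{k}}\cdot\hat{\mathbf{k}}'$. In the non-linear regime, integrating over angles yields, with $|\mathbf{k}-\mathbf{k}'| \sim k$,

$$k^2 P_{pp}(k) = k^2 P_{vv}(k) + \frac{1}{3}\, k^2 P_{\delta\delta}(k) \int \frac{d^3k'}{(2\pi)^3}\, P_{vv}(k'). \qquad (157)$$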

This latter result is similar to the one proposed by [119] to calculate the 
momentum density field associated with the baryon field. Here, one replaces 
the density field power spectrum with the non-linear power spectrum, either 
from the halo model or from perturbation theory. In figure 32, we summarize 
results from [174], which show that the halo model calculation is in good
agreement with the numerical measurements.



The halo model provides a simple explanation of why the above approach works
[258]. In general, we can separate the contributions into 1- and 2-halo terms, so
that

$$P_{pp}(k) = P^{1h}_{pp}(k) + P^{2h}_{pp}(k). \qquad (158)$$



Using equation (146), we can write the two terms by noting that 



k 2pih (k) _ k 2 p ih (k) , /■ m 2 n(m) k 2 a 2 alo (m) k 3 \u(kr viT \m) 



2 



P 2 ; 2 (fi)# 2 

k?P%(k)=k?P%(k) + k ]i^ Pts h (k), (159) 

where $u(k)$ is the same density profile factor that appears when computing the power in
the density field. The second factor in the two-halo term comes from using the
fact that $\sigma_{\rm halo}$ depends only weakly on $m$ (see figure 29), so we approximate
it by setting it equal to its value at $m_*$. Similarly, the 1- and 2-halo terms of
the velocity power spectra are



h 2 p ih (k , _ I , m 2 n(m) k 2 a 2 alo (m) k 3 W 2 (kr vir \m) 
k P vv (AO - / dm _ 2 — and 



2 

,2 r>2h/u\_ olin/ 



fc^(fc)=P' m (fc) 



/dm —W{kR\m) 

J (J 



(160) 



where $W(x)$ is the Fourier transform of a tophat window, $(r_{\rm vir}/R)^3 = \Omega/\Delta_{\rm nl}$,
with $R(m) = (3m/4\pi\bar\rho)^{1/3}$.
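For reference, the tophat window and the radius $R(m)$ used above can be evaluated as follows. This is a minimal sketch: the closed form $3(\sin x - x\cos x)/x^3$ is the standard Fourier transform of a uniform sphere, while the function names and the placeholder units are illustrative assumptions.

```python
import math

def tophat_window(x):
    """Fourier transform of a real-space tophat: W(x) = 3 (sin x - x cos x) / x^3."""
    if abs(x) < 1e-3:
        # Leading terms of the series; avoids cancellation error near x = 0.
        return 1.0 - x * x / 10.0
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3

def lagrangian_radius(m, rho_bar):
    """R(m) = (3 m / 4 pi rho_bar)^(1/3): radius of a uniform sphere of mean
    density rho_bar that contains mass m."""
    return (3.0 * m / (4.0 * math.pi * rho_bar)) ** (1.0 / 3.0)

rho_bar = 1.0                          # placeholder units
m = 4.0 * math.pi / 3.0 * 8.0          # chosen so that R = 2 in these units
R = lagrangian_radius(m, rho_bar)
for k in (0.01, 0.5, 1.0, 5.0):
    print(k, tophat_window(k * R))
```

The window tends to unity for $kR \ll 1$ and oscillates with decaying amplitude for $kR \gg 1$, which is why the smoothed linear-theory velocity power is insensitive to modes well inside the initial halo radius.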

In the halo model, the 1-halo contribution to the momentum density field is
similar to the approximation introduced by [119], where one sets $P_{pp}(k) \sim
P(k)\,V^2_{\rm rms}$, where $V^2_{\rm rms} = \int dk\, P^{\rm lin}(k)/2\pi^2$. The single-halo contribution integrates
over the linear-theory velocity power spectrum that is smoothed with a filter
at the scale of the initial size of the halo. Since the halo velocity is independent
of mass (panel on right of Figure 29), one obtains a reasonably accurate result
by simply setting $\sigma_{\rm halo}$ to the value at $m_*$. In this approximation, $P^{1h}_{pp}(k) \approx
[\sigma_{\rm halo}(m_*)/fH]^2 P^{1h}(k)$ and at non-linear scales, $P(k) \sim P^{1h}(k)$; thus, at
non-linear scales, the ratio of power in momentum to velocity is a constant.

Figure 33 compares this model with measurements in the GIF simulations. The
turnover in the measurements at $k \sim 5h$/Mpc in these figures is not real; it is
due to the finite grid on which the power spectra have been evaluated. On the
larger scales (smaller $k$) where the grid is not important, our model provides



Fig. 33. Power spectrum of the momentum: $k^2\Delta_{pp}(k)$. Filled circles show the sum
of the power spectrum of the divergence (open circles) and the curl (crosses) of 
the momentum fields in the simulations. Dashed curves show the linear theory 
prediction, and solid curves show our nonlinear theory prediction for the total power. 
The solid curves are obtained by summing the dot-dashed curves, which represent 
the contributions to the power from the single-halo and two-halo terms discussed 
in the text. Bottom panel shows the fraction of the total power contributed by the 
divergence (open circles) and the curl (crosses) components, and solid lines show 
what our model predicts. 

a good description of the power spectrum of the momentum. The halo model
predicts that the power spectrum of the curl should equal $(2/3)\,k^2 P^{1h}_{pp}(k)$. In
addition, one must account for the curl which comes from the second term in
the two-halo contribution to the power. We have done this by assuming that
two thirds of this term is in the curl component. The bottom panels show
that this provides a good description of how the power is divided up between
the divergence and the curl on small scales, but the agreement is not good on
large scales. Presumably this is because our assignment of 2/3 of the power
from the second term in $P^{2h}_{pp}$ is an overestimate.

We can extend the discussion to also consider cross power spectra between
the velocity, momentum and density fields. The 2-halo terms associated with
these correlations are simple because all three fields trace each
other:

$$P^{2h}_{pv}(k) = \sqrt{P^{2h}_{pp}(k)\, P^{2h}_{vv}(k)}, \qquad (161)$$



and similarly for the other pairs. The single-halo terms are only slightly more 
complicated: 



TDih (v ;\- [ a ra 2 n(m) o- ha io(m) k 3 \u(kr vil \m)\\W(kR\m)\ 

p " {k) -J dm ^/^J(m w ' 

(h \ _ r>ih( h s f Arm m 2 n(m) o-j &lo (m) k 3 \u{kr vh \m)\\W{kR\m)\ 
(k)-P vv (k) + J dm ^^p m -2 ^ > 

p;m=pim + | dm ^y)^M^i«y)i\ (162) 



plh 

pv 



If one again ignores the weak mass dependence of $\sigma_{\rm halo}$, $P_{p\delta}(k) \sim V_{\rm rms}\, P^{1h}(k)$
at large $k$, where $V_{\rm rms} = \sigma_{\rm halo}(m_*)$. This closely resembles the
corresponding approximation for the momentum spectrum, $P_{pp}(k) \sim V^2_{\rm rms} P(k)$, and so
provides a simple way of using $P(k)$ to estimate $P_{pv}$.

If we define $R_{p\delta} = P_{p\delta}/\sqrt{P_{pp} P_{\delta\delta}}$ and similarly for the other pairs, then the
expressions above show that $R = 1$ at small $k$. If we ignore the mass
dependence of $\sigma_{\rm halo}$, then $R_{p\delta} \approx 1$ at both small and large $k$, so it depends on scale
only over a limited range of scales. Fig. 34 shows that our predictions for $P_{pv}$ and
$R_{pv}$ fit the simulations quite well; note that $R_{pv}$ is always quite close to unity,
even at large $k$.



7.5 Redshift-Space Power Spectrum



Our description of virial velocities provides a mechanism to calculate the
redshift-space distortions in the non-linear regime of clustering. Following [145],
we can write the redshift-space fluctuation, $\delta^z_g$, of the galaxy density field as

$$\delta^z_g(\mathbf{k}) = \delta_g(\mathbf{k}) + \delta_v\,\mu^2, \qquad (163)$$
Fig. 34. Cross-spectrum of the momentum and the velocity. Filled circles show $\Delta_{pp}$,
triangles show $\Delta_{vv}$ and stars show $\Delta_{pv}$. Dashed curves show the linear theory
prediction, dotted curves show our nonlinear theory predictions for the momentum
and the velocity, and solid curves show our prediction for the cross spectrum. The
bottom panel shows the ratio of the cross spectrum to the square root of the product
of the individual spectra, and solid lines show what our model predicts.

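where $\delta_v$ is the velocity divergence and $\mu = \hat{\mathbf{r}}\cdot\hat{\mathbf{k}}$. At linear scales, one can
simplify the relation by noting that $\delta_g(\mathbf{k}) = b_g\,\delta(\mathbf{k})$ and $\delta_v = f(\Omega_m)\,\delta(\mathbf{k})$ to
obtain

$$\delta^z_g(\mathbf{k}) = \delta_g(\mathbf{k})\,[1 + \beta\mu^2], \qquad (164)$$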



where $\beta = f(\Omega_m)/b_g$; this parameter is of traditional interest in cosmology as it
allows constraints to be placed on the density parameter $\Omega_m$ through clustering
in galaxy redshift surveys. We refer the reader to a review by Strauss & Willick
Fig. 35. The ratio of power in redshift space compared to real space from the halo
model (solid line) and from N-body simulations (data points). This figure is
reproduced from [290].

[270], with recent applications related to the 2dFGRS survey in [207,213]. At
linear scales, the distortions increase the power by a factor $(1 + 2\beta/3 + \beta^2/5)$,
which when $b_g = 1$ is 1.41 for $\Omega_m = 0.35$. At non-linear scales, virial velocities
within halos modify clustering properties. With the description of the one-dimensional
virial motions, $\sigma$, which can be described by a Gaussian, we write



This allows us to write the power spectrum in redshift space as [245] 



(165) 



^(*0=O*)+O*) 



where 




R p (k(r)\u glll (k\m)\ p , 



(166) 



with 




$$F_v = f(\Omega_m) \int dm\, \frac{m}{\bar\rho}\, n(m)\, b_1(m)\, R_1(k\sigma)\, u(k|m), \qquad (167)$$
and

$$R_p\left(\alpha = k\sigma\sqrt{p/2}\right) = \frac{\sqrt{\pi}}{2}\, \frac{{\rm erf}(\alpha)}{\alpha}, \qquad (168)$$

for $p = 1, 2$. In equation (166), $\bar n_{\rm gal}$ denotes the mean number density of
galaxies (equation 129).
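The closed form in equation (168) can be checked numerically. The sketch below assumes that $R_p$ arises from averaging the Gaussian damping factor $\exp[-p(k\sigma\mu)^2/2]$ over $0 \le \mu \le 1$; that interpretation, and the function names, are assumptions made for illustration.

```python
import math

def r_p_closed_form(k_sigma, p):
    """R_p = (sqrt(pi)/2) erf(alpha)/alpha with alpha = k sigma sqrt(p/2)."""
    alpha = k_sigma * math.sqrt(p / 2.0)
    if alpha < 1e-6:
        return 1.0  # limit alpha -> 0
    return 0.5 * math.sqrt(math.pi) * math.erf(alpha) / alpha

def r_p_numerical(k_sigma, p, n_mu=20000):
    """Midpoint-rule average over mu in [0, 1] of exp(-p (k sigma mu)^2 / 2)."""
    dmu = 1.0 / n_mu
    total = 0.0
    for i in range(n_mu):
        mu = (i + 0.5) * dmu
        total += math.exp(-0.5 * p * (k_sigma * mu) ** 2)
    return total * dmu

for k_sigma in (0.1, 1.0, 5.0):
    for p in (1, 2):
        print(k_sigma, p, r_p_closed_form(k_sigma, p), r_p_numerical(k_sigma, p))
```

For $k\sigma \ll 1$ both expressions tend to unity (no suppression), while for $k\sigma \gg 1$ they fall off as $1/(k\sigma)$, which is the damping of small-scale power by virial motions described in the text.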

Even though peculiar velocities increase power at large scales, virial motions 
within halos lead to a suppression of power. In figure 35, we show the ratio 
of power in redshift space to that of real space for dark matter alone. Note 
the sharp reduction in power at scales corresponding to the 1-halo term of
the power spectrum. When compared to the real space 1-halo contribution to 
the power spectrum, the redshift space 1-halo term is generally reduced. This 
partly explains the reason why perturbation theory works better in redshift 
space than in real space [290]. 
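On large scales the ratio plotted in figure 35 should approach the linear boost factor $(1 + 2\beta/3 + \beta^2/5)$ quoted earlier in this section. As a small worked example, the sketch below reproduces the value of 1.41 for $\Omega_m = 0.35$ and $b_g = 1$, assuming the common approximation $f(\Omega_m) \approx \Omega_m^{0.6}$ for the growth rate; this approximation and the function names are assumptions on our part.

```python
def growth_rate(omega_m):
    """Assumed approximation for the logarithmic growth rate, f ~ Omega_m^0.6."""
    return omega_m ** 0.6

def linear_redshift_boost(omega_m, bias=1.0):
    """Angle-averaged linear boost of the power spectrum: 1 + 2 beta/3 + beta^2/5."""
    beta = growth_rate(omega_m) / bias
    return 1.0 + 2.0 * beta / 3.0 + beta ** 2 / 5.0

print(round(linear_redshift_boost(0.35), 2))           # 1.41
print(round(linear_redshift_boost(0.35, bias=2.0), 2)) # a more biased tracer is boosted less
```

Because $\beta$ scales inversely with the bias, more strongly biased tracers show a smaller large-scale enhancement for the same cosmology.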



8 Weak Gravitational Lensing 

8.1 Introduction

Weak gravitational lensing of faint galaxies probes the distribution of mat- 
ter along the line of sight. Lensing by large-scale structure (LSS) induces 
correlation in the galaxy ellipticities at the percent level (e.g., [27,188,146] 
and recent reviews by [7,186]). Though challenging to measure, these corre- 
lations provide important cosmological information that is complementary to 
that supplied by the cosmic microwave background and potentially as precise 
(e.g., [133,14,152,147,234,121,48,282]). Indeed several recent studies have pro- 
vided the first clear evidence for weak lensing in so-called blank fields (e.g., 
[283,2,295,294]), though more work is clearly needed to understand even the 
statistical errors (e.g. [59]). 

Given that weak gravitational lensing results from the projected mass distribu- 
tion, the statistical properties of weak lensing convergence reflect those of the 
dark matter. Current measurements of weak lensing involve the shear, which 
is directly measurable through galaxy ellipticities, and constructed through a 
correction for the anisotropic point-spread function [150], or via a series of basis 
functions, called "shapelets", that make use of information from higher order
multipoles, beyond the quadrupole, to represent the galaxy shape [225,226]. In 
such galaxy shear data, statistical measurements include variance and shear- 
shear correlation functions; as we will soon discuss, these measurements are 
related to the power spectrum of convergence. Additionally, in shear data, the 
convergence can be constructed through approaches such as the aperture mass
[149,233]. Such a construction allows direct measurements of statistics related 
to convergence such as its power spectrum and higher order correlations. 

The halo approach to non-linear clustering considered in this review allows 
one to study various statistical measurements related to weak lensing. Ad- 
ditionally, one can use the halo model to investigate various statistical and 
systematic effects in current and upcoming data. For example, weak lensing 
surveys are currently limited to small fields which may not be representative of 
the universe as a whole owing to sample variance. In particular, rare massive 
objects can contribute strongly to the mean power in the shear or convergence 
but not be present in the observed fields. The problem is compounded if one 
chooses blank fields subject to the condition that they do not contain known 
clusters of galaxies. Through the halo mass function, we can quantify the ex- 
tent to which massive halos dominate the cosmological weak lensing effect 
and, thus, the required survey volume, or projected area on the sky, needed 
to obtain a fair sample of the large scale structure [59]. 

Non-linearities in the mass distribution also induce non-Gaussianity in the 
convergence distribution. These non-Gaussianities contribute to higher order 
correlations in convergence, such as a measurable skewness, and also contribute 
to the covariance of the power spectrum measurements. With growing obser- 
vational and theoretical interest in weak lensing, statistics such as skewness 
have been suggested as probes of cosmological parameters and the non-linear 
evolution of large scale structure [14,134,128,196,282]. Similarly, we can also 
consider the bispectrum of convergence, the Fourier analog of the three-point 
correlation function. Since lensing probes non-linear scales, the bispectrum or
the skewness cannot be modeled with perturbation theory alone, as it is only
applicable on large, linear scales. In fact, it is well known that predictions
based on perturbation theory underestimate the measured skewness in
numerical simulations of lensing convergence [289]. The halo model provides a
simple analytic technique to extend the calculations to the non-linear regime 
and predictions based on the halo model are consistent with the numerical 
simulations [54]. 

In terms of the power spectrum covariance, the non-Gaussian contributions
arise from the four-point correlation function, or the trispectrum in Fourier
space. These non-Gaussian contributions are especially significant if observa- 
tions are limited to small fields of view such that power spectrum measure- 
ments are made with wide bins in multipole, or Fourier, space. Similar to 
the PThalo approach that allowed measurements of the covariance related to 
galaxy two-point correlation function [241], the halo approach provides an ana- 
lytical scheme to estimate the covariance of binned power spectrum of shear or 
convergence, based on the non-Gaussian contribution. The calculation related 
to the convergence covariance requires detailed knowledge on the dark matter 
Fig. 36. Weak lensing convergence power spectrum under the halo description. Also
shown is the prediction from the PD non-linear power spectrum fitting function. We 
have separated individual contributions under the halo approach. We have assumed 
that all sources are at z s = 1. 

density trispectrum, which can be obtained analytically through perturbation 
theory (e.g., [14]) or numerically through simulations (e.g., [134,289]). Since 
numerical simulations are limited by computational expense to a handful of 
realizations of cosmological models with modest dynamical range, approaches
such as the halo model are useful for speedy calculations with accuracies at the
level of a few tens of percent or better.



8.2 Convergence 



Weak lensing probes the statistical properties of the shear field, and we can write
the deformation matrix that maps the separation vector $\Delta x$ between the source, $s$,
and image, $i$, planes, $\Delta x^s_i = A_{ij}\,\Delta x^i_j$, as

$$A_{ij} = \begin{pmatrix} 1 - \kappa - \gamma_1 & -\gamma_2 - \omega \\ -\gamma_2 + \omega & 1 - \kappa + \gamma_1 \end{pmatrix}. \qquad (169)$$

Here $\kappa$ is the convergence, responsible for magnification and demagnification,
$\omega$ is the net rotation of the image, and we have separated the components of
the shear, $\gamma = \gamma_1 + i\gamma_2$, which transforms as a spin-2 field, $\gamma = |\gamma|\,e^{2i\phi}$, and is a
pseudo vector field. The shear components are
Fig. 37. Weak lensing convergence (a) bispectrum and (b) trispectrum under the
halo description. We have separated individual contributions under the halo
approach, up to 3 halos in the case of the bispectrum and 4 halos in the case of the trispectrum.
We have assumed that all sources are at $z_s = 1$.



$$\gamma_1 = \frac{1}{2}\left(\psi_{11} - \psi_{22}\right), \qquad \gamma_2 = \frac{1}{2}\left(\psi_{12} + \psi_{21}\right), \qquad (170)$$



where, 



d A (r d )d A (r d - r s 
d A {r s ) 
To lowest order in potential fluctuations, $\omega \approx 0$ and we can ignore
the asymmetry associated with the deformation matrix; thus, $\gamma_2 = \psi_{12}$, as is
commonly known.

We can write the convergence using the trace of the deformation matrix,
$\kappa = \frac{1}{2}(\psi_{11} + \psi_{22})$:

$$\kappa(\hat{\mathbf{n}}) = \int dr\, W^{\rm lens}(r)\, \nabla^2_\perp \Phi(r, r\hat{\mathbf{n}}), \qquad (172)$$


where the lensing visibility function for a radial distribution of background 
sources, n s (r), is 

W~(r) = /• ***fl- r W ) ■ (173) 

Here, $r$ is the comoving radial distance (equation 2) and $d_A$ is the angular
diameter distance (equation 3).

8.3 Power spectrum 

We can write the angular power spectrum of convergence by expanding in spherical
harmonics,

$$\kappa(\hat{\mathbf{n}}) = \sum_{lm} \kappa_{lm}\, Y^m_l(\hat{\mathbf{n}}), \qquad (174)$$

with spherical moments of the convergence field defined such that

Kim = i'/ |^l $ ( k ) / drW(r) 3l (kr)Yr(k) , 

(175) 

where W r (fc, r) is the visibility function associated with weak lensing defined 
in equation (173). Here, we have simplified using the Rayleigh expansion of a 
plane wave 

$$e^{i\mathbf{k}\cdot\hat{\mathbf{n}}r} = 4\pi \sum_{lm} i^l\, j_l(kr)\, Y^{m*}_l(\hat{\mathbf{k}})\, Y^m_l(\hat{\mathbf{n}}). \qquad (176)$$

In the small scale limit, only the modes perpendicular to the radial direction
contribute to the integral in equation (175), while others are suppressed
through near-perfect cancellation of positive and negative oscillations along
the line of sight. Thus, we can replace $k_\perp \sim k$, which only makes an error of
order $\Phi \sim 10^{-5}$. Further, we can use the Poisson equation (equation 29) to
relate potential fluctuations to those of density, $k^2\Phi(\mathbf{k}) = 4\pi G\bar\rho a^2\,\delta(\mathbf{k})$. These
allow us to construct the angular power spectrum of the convergence, defined
in terms of the multipole moments, $\kappa_{lm}$, as

$$\langle \kappa_{lm}\, \kappa^*_{l'm'} \rangle = C^\kappa_l\, \delta_{ll'}\, \delta_{mm'}, \qquad (177)$$

to obtain

$$C^\kappa_l = \frac{2}{\pi} \int k^2 dk\, P(k) \int dr_1 \int dr_2\, W^{\rm lens}(r_1)\, W^{\rm lens}(r_2)\, j_l(kr_1)\, j_l(kr_2), \qquad (178)$$





where 



When all background sources are at a distance of $r_s$, $n_s(r') = \delta_D(r' - r_s)$, the
weight function reduces to

$$W^{\rm lens}(r) = \frac{3}{2}\, \Omega_m\, \frac{H_0^2}{c^2 a}\, \frac{d_A(r)\, d_A(r_s - r)}{d_A(r_s)}. \qquad (180)$$



In the case of a non-flat geometry, one needs to introduce curvature corrections
to the Poisson equation (see equation 29), and replace the radial Bessel
functions, $j_l$, with hyperspherical Bessel functions. In the small scale limit, for
efficient calculation, we can simplify further by using the Limber,
or small angle, approximation [168], where one can neglect the radial component
of the Fourier mode $\mathbf{k}$ compared to the transverse component. Here,
we employ a version based on the completeness relation of spherical Bessel
functions (see [54,122] for details),

$$\int dk\, k^2\, F(k)\, j_l(kr)\, j_l(kr') \approx \frac{\pi}{2}\, d_A^{-2}\, \delta_D(r - r')\, F(k)\Big|_{k = l/d_A}, \qquad (181)$$



where the assumption is that $F(k)$ is a slowly-varying function. Under this
assumption, the contributions to the power spectrum come only from correlations
at equal time surfaces. Finally, we can write the convergence power
spectrum as [146,147]:

$$C^\kappa_l = \int dr\, \frac{[W^{\rm lens}(r)]^2}{d_A^2}\, P\left(k = \frac{l}{d_A};\, r\right). \qquad (182)$$
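To illustrate how the Limber approximation is used in practice, note that substituting equation (181) into equation (178) collapses the three integrals to a single radial one, $C_l = \int dr\, [W^{\rm lens}(r)]^2 d_A^{-2}\, P(l/d_A; r)$. The sketch below evaluates this for a deliberately simplified setup: a flat Einstein-de Sitter universe (so $d_A = r$ and $\Omega_m = 1$), all sources at $z_s = 1$, and a hypothetical power-law $P(k)$ scaled by linear growth. All of these choices, and the function names, are assumptions made only to keep the example self-contained; they are not the cosmology or power spectrum used for figure 36.

```python
import math

C_OVER_H0 = 2997.92458  # Hubble distance c/H0 in Mpc/h

def scale_factor(r):
    """Einstein-de Sitter relation r = (2c/H0)(1 - sqrt(a)), inverted (toy assumption)."""
    return (1.0 - r / (2.0 * C_OVER_H0)) ** 2

def lens_weight(r, r_s, omega_m=1.0):
    """Weight of equation (180) for a flat geometry, where d_A(r) = r."""
    return 1.5 * omega_m / (C_OVER_H0 ** 2 * scale_factor(r)) * r * (r_s - r) / r_s

def toy_power(k, r, amp=1.0e4, k0=0.1, n=-1.5):
    """Hypothetical power-law P(k) in (Mpc/h)^3, scaled by EdS linear growth D = a."""
    return amp * (k / k0) ** n * scale_factor(r) ** 2

def limber_cl(ell, r_s, n_steps=2000):
    """C_l = int dr W(r)^2 / r^2 P(l/r; r), midpoint rule over 0 < r < r_s."""
    dr = r_s / n_steps
    total = 0.0
    for i in range(n_steps):
        r = (i + 0.5) * dr
        w = lens_weight(r, r_s)
        total += w * w / (r * r) * toy_power(ell / r, r) * dr
    return total

r_s = 2.0 * C_OVER_H0 * (1.0 - math.sqrt(0.5))  # sources at z_s = 1 in EdS
for ell in (100, 1000, 10000):
    print(ell, limber_cl(ell, r_s))
```

For a pure power law $P \propto k^n$ the result scales exactly as $C_l \propto l^n$, which provides a simple sanity check; a realistic calculation would replace `toy_power` with the halo model (or fitted) non-linear power spectrum.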
8.4 Relation to Shear Correlations



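We can consider the relation between the convergence power spectrum and shear
correlation functions by considering the Fourier decomposition of the shear
field [126] into gradient-like (E-mode) and curl-like (B-mode) components:

$$\gamma_1(\hat{\mathbf{n}}) \pm i\gamma_2(\hat{\mathbf{n}}) = \int \frac{d^2\mathbf{l}}{(2\pi)^2}\, [\epsilon(\mathbf{l}) \pm i\beta(\mathbf{l})]\, e^{\pm 2i\phi_l}\, e^{i\mathbf{l}\cdot\hat{\mathbf{n}}}, \qquad (183)$$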



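and consider the correlations $\langle\gamma_1\gamma_1\rangle$, $\langle\gamma_1\gamma_2\rangle$ and $\langle\gamma_2\gamma_2\rangle$. We can write
these correlation functions as

$$\begin{aligned}
\langle\gamma_1(\hat{\mathbf{n}}_i)\gamma_1(\hat{\mathbf{n}}_j)\rangle &= \int \frac{d^2\mathbf{l}}{(2\pi)^2} \left[ C^{\epsilon\epsilon}_l \cos^2 2\phi_l + C^{\beta\beta}_l \sin^2 2\phi_l - C^{\epsilon\beta}_l \sin 4\phi_l \right] e^{i\mathbf{l}\cdot(\hat{\mathbf{n}}_i - \hat{\mathbf{n}}_j)}, \\
\langle\gamma_1(\hat{\mathbf{n}}_i)\gamma_2(\hat{\mathbf{n}}_j)\rangle &= \int \frac{d^2\mathbf{l}}{(2\pi)^2} \left[ \frac{1}{2} C^{\epsilon\epsilon}_l \sin 4\phi_l - \frac{1}{2} C^{\beta\beta}_l \sin 4\phi_l + C^{\epsilon\beta}_l \cos 4\phi_l \right] e^{i\mathbf{l}\cdot(\hat{\mathbf{n}}_i - \hat{\mathbf{n}}_j)}, \\
\langle\gamma_2(\hat{\mathbf{n}}_i)\gamma_2(\hat{\mathbf{n}}_j)\rangle &= \int \frac{d^2\mathbf{l}}{(2\pi)^2} \left[ C^{\epsilon\epsilon}_l \sin^2 2\phi_l + C^{\beta\beta}_l \cos^2 2\phi_l + C^{\epsilon\beta}_l \sin 4\phi_l \right] e^{i\mathbf{l}\cdot(\hat{\mathbf{n}}_i - \hat{\mathbf{n}}_j)}.
\end{aligned} \qquad (184)$$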



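Using the expansion $e^{i\mathbf{l}\cdot\boldsymbol{\theta}} = \sum_m i^m J_m(l\theta)\, e^{im(\phi - \phi_l)}$, in terms of the
magnitude $\theta$ and orientation $\phi$ of the vector $\hat{\mathbf{n}}_i - \hat{\mathbf{n}}_j$, we write

$$\begin{aligned}
\langle\gamma_1\gamma_1\rangle_{\theta,\phi} &= \int \frac{l\,dl}{4\pi} \left\{ C^{\epsilon\epsilon}_l [J_0(l\theta) + \cos(4\phi) J_4(l\theta)] + C^{\beta\beta}_l [J_0(l\theta) - \cos(4\phi) J_4(l\theta)] - 2 C^{\epsilon\beta}_l \sin(4\phi) J_4(l\theta) \right\}, \\
\langle\gamma_1\gamma_2\rangle_{\theta,\phi} &= \int \frac{l\,dl}{4\pi} \left\{ C^{\epsilon\epsilon}_l \sin(4\phi) J_4(l\theta) - C^{\beta\beta}_l \sin(4\phi) J_4(l\theta) + 2 C^{\epsilon\beta}_l \cos(4\phi) J_4(l\theta) \right\}, \\
\langle\gamma_2\gamma_2\rangle_{\theta,\phi} &= \int \frac{l\,dl}{4\pi} \left\{ C^{\epsilon\epsilon}_l [J_0(l\theta) - \cos(4\phi) J_4(l\theta)] + C^{\beta\beta}_l [J_0(l\theta) + \cos(4\phi) J_4(l\theta)] + 2 C^{\epsilon\beta}_l \sin(4\phi) J_4(l\theta) \right\}.
\end{aligned} \qquad (185)$$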



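One can choose an appropriate coordinate system such that the measured
correlation functions are independent of the choice of
coordinates; in the above derivation, this is equivalent to setting $\phi = 0$ (e.g.,
[269]). To do this in practice, in analogy with CMB polarization (see [151]),
shear can be measured parallel and perpendicular to the line joining the two
points, such that $\hat{\mathbf{n}}_i - \hat{\mathbf{n}}_j \parallel \hat{\mathbf{x}}$. In such a coordinate system the correlation functions
reduce to the well known result of [188,146]:

$$\begin{aligned}
\langle\gamma_1\gamma_1\rangle_\theta &= \int \frac{l\,dl}{4\pi} \left\{ C^{\epsilon\epsilon}_l [J_0(l\theta) + J_4(l\theta)] + C^{\beta\beta}_l [J_0(l\theta) - J_4(l\theta)] \right\}, \\
\langle\gamma_1\gamma_2\rangle_\theta &= \int \frac{l\,dl}{2\pi}\, C^{\epsilon\beta}_l\, J_4(l\theta), \\
\langle\gamma_2\gamma_2\rangle_\theta &= \int \frac{l\,dl}{4\pi} \left\{ C^{\epsilon\epsilon}_l [J_0(l\theta) - J_4(l\theta)] + C^{\beta\beta}_l [J_0(l\theta) + J_4(l\theta)] \right\}.
\end{aligned} \qquad (186)$$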

To first order, contributions to the shear correlations primarily come from
perturbations involving scalars, or gradient-like modes, with $C^{\epsilon\epsilon}_l = C^\kappa_l$ and
$C^{\beta\beta}_l = C^{\epsilon\beta}_l = 0$; even if the curl-like contributions are non-zero, $C^{\epsilon\beta}_l$ is zero
due to parity invariance.

The curl-like modes in shear can be generated by tensor perturbations such as
gravity waves. Since there is no appreciable source of primordial gravity-wave
perturbations at late times (see [154] for a review), it is unlikely that there is
a significant contribution to $C^{\beta\beta}_l$, except in two cases:

(1) In the first-order calculation of the weak lensing distortion matrix and
convergence, we have implicitly integrated over the unperturbed photon paths
(the so-called Born approximation; see [14,234]). Corrections to this approximation,
like other second-order effects such as lens-lens coupling involving lenses at two different redshifts,
can generate a curl-like contribution. With second-order corrections to the Born
approximation and lens-lens coupling, we can write the deformation matrix
associated with weak lensing as

$$A_{ij} = \delta_{ij} - \psi^{(1)}_{ij} - \psi^{(2)}_{ij}, \qquad (187)$$

with 



(2)_, f ^jMxVAjx-x') 



dA(x) 



x j dx Mx'')d^ x") dtdkHxWMx ^ > (188) 

due to lens-lens coupling and 

x j d x "d A ( x ' - x'WAHMHx") , (189) 

due to a correction to Born approximation, respectively [234,14,57]. The result- 
ing deformation matrix due to these second-order corrections is asymmetric 
Fig. 38. The power spectra of lensing convergence (solid) and of $C^{\epsilon\epsilon}_l$ (dashed) and
$C^{\beta\beta}_l$ (dot-dashed) due to galaxy ellipticity correlations induced by tidal torques,
as a function of redshift. The tidal torques induce significant correlations at low
redshifts, while these can be ignored for deep weak lensing surveys with background
sources at $z > 1$. The figure is from [175].



and results in a contribution to $C^{\beta\beta}_l$, as well as a contribution to the net
rotation; the latter is equivalent to the Stokes-V contribution in a polarization
field or, equivalently, circular polarization.



The Born approximation and lens-lens coupling have been tested in numerical
simulations by [134], who evaluated the contribution to the convergence
power spectrum resulting from second-order effects. There, the rotational
contribution to the angular power spectrum, due to lens-lens coupling, is roughly 3
orders of magnitude smaller. In [57], it was shown that the corrections due to
the Born approximation are also small compared to the first-order result, so that
$C^{\beta\beta}_l = 0$ remains a good approximation.



(2) The intrinsic correlations between individual background galaxy shapes,
due to long range correlations in the tidal gravitational field in which the halos
containing galaxies formed, can generate a contribution to $C^{\beta\beta}_l$ [60,110,37,175].
The intrinsic correlations have a redshift dependence such that they are significant
at low redshifts. In figure 38, we show the resulting $C^{\epsilon\epsilon}_l$ and $C^{\beta\beta}_l$ power
spectra due to ellipticity alignments in background galaxies arising from tidal
torques, compared to the convergence power spectrum associated with
ellipticity correlations due to lensing, following [175]. To avoid
confusion between lensing-generated ellipticity correlations and tidal-torque-induced
correlations, the results from intrinsic alignment calculations generally
indicate that deep surveys are preferred over shallow ones for cosmological
purposes. We will return to this issue again based on the non-Gaussian contribution
to the convergence power spectrum covariance.
Fig. 39. Moments of the convergence field as a function of top-hat smoothing scale $\sigma$,
with (a) the second moment broken into individual contributions and (b) the third moment
broken into individual contributions.

In addition to shear correlations, one can also measure the shear variance,
which can be related to the convergence power spectrum by

$$\langle\gamma^2(\sigma)\rangle = \langle\kappa^2(\sigma)\rangle = \frac{1}{4\pi} \sum_l (2l+1)\, C^\kappa_l\, W_l^2(\sigma), \qquad (190)$$

where $W_l$ are the multipole moments, or Fourier transform in a flat-sky
approximation, of the window. In figure 39(a), we choose a window which is a
two-dimensional top hat in real space with a window function in multipole
space of $W_l(\sigma) = 2J_1(x)/x$ with $x = l\sigma$. As shown, at 5' to 90' angular scales,
most of the contribution to the second moment comes from the double halo
correlation term and is dominated by the linear power spectrum instead of the
non-linear evolution.
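The top-hat window in multipole space is straightforward to evaluate. The sketch below computes $W_l(\sigma) = 2J_1(x)/x$ using Bessel's integral representation of $J_1$, so that it needs only the standard library; the function names, the quadrature settings and the example smoothing scale are illustrative assumptions.

```python
import math

def bessel_j1(x, n_steps=2000):
    """J_1(x) = (1/pi) int_0^pi cos(theta - x sin theta) dtheta, midpoint rule."""
    dtheta = math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        theta = (i + 0.5) * dtheta
        total += math.cos(theta - x * math.sin(theta))
    return total * dtheta / math.pi

def tophat_window_2d(ell, sigma_rad):
    """Multipole-space window of a 2D real-space top hat of angular radius sigma."""
    x = ell * sigma_rad
    if x < 1e-4:
        return 1.0  # limit x -> 0
    return 2.0 * bessel_j1(x) / x

ARCMIN = math.pi / (180.0 * 60.0)  # one arcminute in radians
for ell in (10, 100, 1000, 5000):
    print(ell, tophat_window_2d(ell, 5.0 * ARCMIN))
```

The window stays close to unity for $l\sigma \ll 1$ and is strongly suppressed for $l\sigma \gg 1$, so larger smoothing scales weight the sum in equation (190) toward low multipoles, where the linear power spectrum dominates.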

In figure 36(a), we show the convergence power spectrum of the dark matter
halos compared with that predicted by the [210] fitting function for the
non-linear dark matter power spectrum. The lensing power spectrum due to halos
has the same behavior as the dark matter power spectrum. At large angles,
$l < 100$, the correlations between halos dominate. The transition from linear
to non-linear is at $l \sim 500$, where halos of mass similar to $M_*(z)$ contribute.
The single halo contributions start dominating at $l > 1000$. When $l >$ a few
thousand, at small scales corresponding to the deeply non-linear regime, the shot
noise of the background sources contributes to the convergence power
spectrum via a noise term

$$C^{\rm SN}_l = \frac{\langle\gamma^2_{\rm int}\rangle}{\bar n}. \qquad (191)$$



Here, $\langle\gamma^2_{\rm int}\rangle^{1/2}$ is the rms noise per component introduced by intrinsic
ellipticities, typically $\sim 0.6$ for best ground based surveys, and $\bar n$ is the surface
number density of background source galaxies.

Note that the shot-noise term is effectively reduced by the number of independent
modes one measures at each multipole. Including the sample variance, the
total error expected for a measurement of the power spectrum, as a function
of multipole, is

$$\Delta C_l = \sqrt{\frac{2}{(2l+1)\, f_{\rm sky}}}\, \left[ C^\kappa_l + C^{\rm SN}_l \right]. \qquad (192)$$

Here, the first term represents the sample variance under the Gaussian
approximation for the convergence field, $\kappa_{lm}$, and is the dominant source of noise
at large angular scales. The factor $f_{\rm sky}$, the fraction of the sky observed, accounts
for the reduction in the number of independent modes under partial sky
coverage. In the absence of noise for an all-sky experiment, at a multipole of
$\sim 100$, the error on the power spectrum due to sample variance is $\sim 10\%$
and is usually reduced with binned measurements of the power spectrum in
multipole space.
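The numbers quoted above follow directly from equation (192). The sketch below evaluates the fractional error $\Delta C_l / C_l$ for a single multipole and for a band of multipoles; the band version assumes independent Gaussian modes and a roughly constant spectrum across the band, and the function names and the example band are our own illustrative choices.

```python
import math

def fractional_error(ell, f_sky=1.0, noise_to_signal=0.0):
    """Delta C_l / C_l from equation (192); noise_to_signal = C_l^SN / C_l^kappa."""
    return math.sqrt(2.0 / ((2 * ell + 1) * f_sky)) * (1.0 + noise_to_signal)

def binned_fractional_error(ell_min, ell_max, f_sky=1.0, noise_to_signal=0.0):
    """Same estimate for a band ell_min..ell_max, counting all (2l+1) modes in the band
    (assumes independent Gaussian modes and a flat spectrum within the band)."""
    n_modes = sum(2 * ell + 1 for ell in range(ell_min, ell_max + 1))
    return math.sqrt(2.0 / (n_modes * f_sky)) * (1.0 + noise_to_signal)

print(fractional_error(100))                 # about 0.10 for an all-sky, noise-free survey
print(fractional_error(100, f_sky=0.01))     # ten times larger for 1% of the sky
print(binned_fractional_error(90, 110))      # binning reduces the error
```

For a small field the factor $f_{\rm sky}^{-1/2}$ quickly dominates, which is why wide multipole bins are used in practice; those wide bins are also where the non-Gaussian contribution to the covariance becomes important.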

For surveys that reach a limiting magnitude of $R \sim 25$, the surface density
is consistent with $\bar n \sim 6.9 \times 10^8$ sr$^{-1}$ ($\approx 56$ gal arcmin$^{-2}$) [263], such that
$C^{\rm SN}_l \sim 2.3 \times 10^{-10}$. This shot-noise contribution reaches the power due to
convergence at multipoles of $\sim 2000$ and dominates the cosmological weak
lensing signal at multipoles thereafter. It is clear that the convergence power
spectrum at multipoles of a few thousand probes the small scale behavior of the
dark matter power spectrum. The presence of significant shot noise, however,
complicates studies that could potentially test assumptions related to large scale
structure, such as the stable clustering hypothesis, or to the halo model, such as
the use of smooth profiles in the presence of substructure seen in numerical
simulations.
Fig. 40. The bispectrum configuration dependence as a function of $l_1$ and $l_2$
with $l_3 = 1000$. Due to the triangular conditions associated with the $l$'s, only the upper
triangular region in $l_1$-$l_2$ space contributes to the bispectrum.

8.5 Bispectrum



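Using the spherical harmonic moments of convergence defined in equation (174),
the angular bispectrum of the convergence is defined following [54,265] as

$$\langle \kappa_{l_1 m_1} \kappa_{l_2 m_2} \kappa_{l_3 m_3} \rangle = \begin{pmatrix} l_1 & l_2 & l_3 \\ m_1 & m_2 & m_3 \end{pmatrix} B^\kappa_{l_1 l_2 l_3}. \qquad (193)$$

Here, the quantity in parentheses is the Wigner 3j symbol. Its orthonormality
relation implies

$$B^\kappa_{l_1 l_2 l_3} = \sum_{m_1 m_2 m_3} \begin{pmatrix} l_1 & l_2 & l_3 \\ m_1 & m_2 & m_3 \end{pmatrix} \langle \kappa_{l_1 m_1} \kappa_{l_2 m_2} \kappa_{l_3 m_3} \rangle. \qquad (194)$$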



The angular bispectrum, $B^\kappa_{l_1 l_2 l_3}$, contains all the information available in the
three-point correlation function. For example, the third moment or the skewness,
the collapsed three-point expression of [113] and the equilateral
configuration statistic of [81] can all be expressed as linear combinations of the
bispectrum terms (see [90] for explicit expressions).

Similar to our discussion related to the convergence power spectrum, we can 
write spherical moments of the convergence field defined with respect to the 
density field as 
$$\kappa_{lm} = i^l \int \frac{d^3k}{2\pi^2}\, \delta(\mathbf{k})\, I^{\rm lens}_l(k)\, Y^m_l(\hat{\mathbf{k}}) \quad {\rm and} \quad I^{\rm lens}_l(k) = \int dr\, W^{\rm lens}(k, r)\, j_l(kr), \qquad (195)$$

where $W^{\rm lens}(k,r)$ is the source function associated with weak lensing (see
equation 179).

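The bispectrum can be constructed through

$$\langle \kappa_{l_1 m_1} \kappa_{l_2 m_2} \kappa_{l_3 m_3} \rangle = i^{l_1 + l_2 + l_3} \int \frac{d^3k_1}{2\pi^2} \int \frac{d^3k_2}{2\pi^2} \int \frac{d^3k_3}{2\pi^2}\, \langle \delta(\mathbf{k}_1) \delta(\mathbf{k}_2) \delta(\mathbf{k}_3) \rangle\, I^{\rm lens}_{l_1}(k_1)\, I^{\rm lens}_{l_2}(k_2)\, I^{\rm lens}_{l_3}(k_3)\, Y^{m_1}_{l_1}(\hat{\mathbf{k}}_1)\, Y^{m_2}_{l_2}(\hat{\mathbf{k}}_2)\, Y^{m_3}_{l_3}(\hat{\mathbf{k}}_3), \qquad (196)$$

and can be simplified further by using the bispectrum of density fluctuations
to write the convergence bispectrum as

$$B^\kappa_{l_1 l_2 l_3} = \sum_{m_1 m_2 m_3} \begin{pmatrix} l_1 & l_2 & l_3 \\ m_1 & m_2 & m_3 \end{pmatrix} \langle \kappa_{l_1 m_1} \kappa_{l_2 m_2} \kappa_{l_3 m_3} \rangle = \sqrt{\frac{\prod_{i=1}^{3} (2l_i + 1)}{4\pi}} \begin{pmatrix} l_1 & l_2 & l_3 \\ 0 & 0 & 0 \end{pmatrix} b_{l_1, l_2, l_3}, \qquad (197)$$

with

$$b_{l_1, l_2, l_3} = \frac{2^3}{\pi^3} \int k_1^2 dk_1 \int k_2^2 dk_2 \int k_3^2 dk_3\, B(k_1, k_2, k_3)\, I^{\rm lens}_{l_1}(k_1)\, I^{\rm lens}_{l_2}(k_2)\, I^{\rm lens}_{l_3}(k_3) \int x^2 dx\, j_{l_1}(k_1 x)\, j_{l_2}(k_2 x)\, j_{l_3}(k_3 x). \qquad (198)$$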



In general, the calculation of $b_{l_1, l_2, l_3}$ involves seven integrals: the mode
coupling integral, and three integrals each over distances and Fourier modes.
We can simplify further by employing the Limber approximation,
similar to our derivation of the power spectrum. Applying equation (181) to
the integrals involving $k_1$, $k_2$ and $k_3$ allows us to write the angular bispectrum
of lensing convergence as

$$B^\kappa_{l_1 l_2 l_3} = \sqrt{\frac{\prod_{i=1}^{3} (2l_i + 1)}{4\pi}} \begin{pmatrix} l_1 & l_2 & l_3 \\ 0 & 0 & 0 \end{pmatrix} \int dr\, \frac{[W^{\rm lens}(r)]^3}{d_A^4}\, B\left(\frac{l_1}{d_A}, \frac{l_2}{d_A}, \frac{l_3}{d_A};\, r\right). \qquad (199)$$
Through angular momentum selection rules, the Wigner 3j symbol restricts the $l_i$
to form a triangle such that $|l_j - l_k| \le l_i \le l_j + l_k$. Additional properties of the Wigner
3j symbol can be found in the Appendix of [54].
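In numerical work, the selection rules are usually applied before any integral is evaluated, since most triplets give an identically vanishing bispectrum. The sketch below encodes the triangle condition, together with the standard additional rule that the 3j symbol with all $m_i = 0$ vanishes unless $l_1 + l_2 + l_3$ is even; the function names are illustrative assumptions.

```python
def satisfies_triangle(l1, l2, l3):
    """True if |l1 - l2| <= l3 <= l1 + l2 (equivalent to the condition on any permutation)."""
    return abs(l1 - l2) <= l3 <= l1 + l2

def m0_symbol_can_be_nonzero(l1, l2, l3):
    """The 3j symbol (l1 l2 l3; 0 0 0) vanishes unless the triangle condition holds
    and l1 + l2 + l3 is even."""
    return satisfies_triangle(l1, l2, l3) and (l1 + l2 + l3) % 2 == 0

# Which (l1, l2) pairs on a coarse grid are allowed when l3 = 1000?
l3 = 1000
grid = range(0, 2001, 500)
allowed = [(l1, l2) for l1 in grid for l2 in grid if m0_symbol_can_be_nonzero(l1, l2, l3)]
print(allowed)
```

The allowed pairs fill only part of the $l_1$-$l_2$ plane, which is the origin of the restricted region shown in the configuration-dependence plot for $l_3 = 1000$.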

The more familiar flat-sky bispectrum is [55,120]:

$$B^\kappa(\mathbf{l}_1, \mathbf{l}_2, \mathbf{l}_3) = \int dr\, \frac{[W^{\rm lens}(r)]^3}{d_A^4}\, B\left(\frac{\mathbf{l}_1}{d_A}, \frac{\mathbf{l}_2}{d_A}, \frac{\mathbf{l}_3}{d_A};\, r\right), \qquad (200)$$

where the $\mathbf{l}_i$ are now two-dimensional vectors. In the case of the flat-sky
bispectrum, the Wigner 3j symbol in the all-sky expression becomes a triangle
equality relating the two-dimensional vectors. The implication is that the
triplet $(l_1, l_2, l_3)$ can be considered to contribute to the triangle configuration
of $\mathbf{l}_1$, $\mathbf{l}_2$, $\mathbf{l}_3 = -(\mathbf{l}_1 + \mathbf{l}_2)$, where the multipole number is taken as the length of
the vector. The correspondence between the all-sky derivation, equation (199),
and the flat-sky approximation, equation (200), can be noted by expanding
the delta function involved with $\mathbf{l}_1 + \mathbf{l}_2 + \mathbf{l}_3 = 0$ [120].

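In the flat-sky case, we can generalize our result to any $N$-point Fourier
space correlation as

$$P^\kappa_N(\mathbf{l}_1,\ldots,\mathbf{l}_N) = \int dr\, \frac{[W^{\rm lens}(r)]^N}{d_A^{2N-2}}\, P_N\left(\frac{\mathbf{l}_1}{d_A},\ldots,\frac{\mathbf{l}_N}{d_A};r\right) \,,$$ (201)

where the vectors satisfy $\mathbf{l}_1 + \ldots + \mathbf{l}_N = 0$.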
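
As an illustration of the projection in equation (201), the following minimal sketch integrates the Limber kernel with the trapezoid rule. All names (`limber_project`, `weight`, `spectrum`, `d_a`) are hypothetical, and the sketch assumes for simplicity that the $N$-point spectrum is supplied as a function of the wavenumber magnitudes $l_i/d_A$ and the distance $r$; a real calculation would pass the full vector configuration.

```python
def limber_project(order, ells, weight, spectrum, d_a, r_min, r_max, steps=2000):
    """Trapezoid estimate of int dr W(r)^N / d_A(r)^(2N-2) P_N(l_1/d_A, ..., l_N/d_A; r)."""
    if len(ells) != order:
        raise ValueError("need exactly one multipole per point of the correlation")
    h = (r_max - r_min) / steps
    total = 0.0
    for i in range(steps + 1):
        r = r_min + i * h
        da = d_a(r)
        ks = [ell / da for ell in ells]  # Limber wavenumbers k_i = l_i / d_A
        integrand = weight(r) ** order / da ** (2 * order - 2) * spectrum(ks, r)
        total += integrand * (0.5 if i in (0, steps) else 1.0)
    return total * h
```

Setting `order=2` gives the power spectrum projection, `order=3` the bispectrum of equation (199) without its geometric prefactor, and `order=4` the trispectrum projection used for the covariance.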

Similar to the density field bispectrum, we define

$$\Delta^2_{{\rm eq}\,l} = \frac{l^2}{2\pi} \sqrt{B^\kappa_{l\,l\,l}} \,,$$ (202)

involving equilateral triangles in $l$-space. In figure 37(a), we show
$\Delta^2_{{\rm eq}\,l}$. The general behavior of the lensing bispectrum can be
understood through the individual contributions to the density field
bispectrum: at small multipoles, the triple halo correlation term dominates,
while at high multipoles, the single halo term dominates. The double halo term
contributes at intermediate $l$'s corresponding to angular scales of a few tens
of arcminutes. In figure 40, we plot the configuration dependence

$$R^{l_3}_{l_1 l_2} = \frac{l_1 l_2}{2\pi} \frac{\sqrt{B^\kappa_{l_1 l_2 l_3}}}{\Delta^2_{{\rm eq}\,l_3}}$$ (203)

as a function of $l_1$ and $l_2$ when $l_3 = 1000$. The surface, and associated
contour plot, shows the contribution to the bispectrum from triangular
configurations in $l$-space relative to that from the equilateral
configuration. Because of the triangular conditions associated with the $l$'s,
only the upper triangular region of $l_1$-$l_2$ space contributes to the
bispectrum. The symmetry about the $l_1 = l_2$ line is due to the intrinsic
symmetry associated with the bispectrum. Although the weak lensing bispectrum
peaks for equilateral configurations, the configuration dependence is weak. In
the case of the dark matter bispectrum, it is now known that the halo model
somewhat overestimates the configuration dependence due to the spherical
assumption for halos [236]. This overestimate should also be present in
projected dark matter statistics such as the lensing convergence bispectrum.

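As discussed in the case of the second moment, it is likely that the first
measurements of higher order correlations in lensing would be through real
space statistics. Thus, in addition to the bispectrum, we also consider the
skewness, which is associated with the third moment of the smoothed map
(cf. equation 190)

$$\langle \kappa^3(\sigma) \rangle = \frac{1}{4\pi} \sum_{l_1 l_2 l_3} \sqrt{\frac{\prod_{i=1}^{3}(2 l_i + 1)}{4\pi}} \begin{pmatrix} l_1 & l_2 & l_3 \\ 0 & 0 & 0 \end{pmatrix} B^\kappa_{l_1 l_2 l_3}\, W_{l_1}(\sigma)\, W_{l_2}(\sigma)\, W_{l_3}(\sigma) \,,$$ (204)

where $W_l(\sigma)$ denotes the smoothing window in multipole space. One can
then construct the skewness as

$$S_3(\sigma) = \frac{\langle \kappa^3(\sigma) \rangle}{\langle \kappa^2(\sigma) \rangle^2} \,,$$ (205)

where $\langle \kappa^2(\sigma) \rangle$ is the second moment of the
convergence field defined in equation (190).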
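
The normalization in equation (205) can be illustrated with a pixel-space estimate. The sketch below is not the harmonic sum of equation (204); it simply assumes one already has a list of convergence values from a smoothed map (the input `kappa_values` and the function name are hypothetical) and forms the ratio of the third central moment to the square of the second.

```python
def skewness_s3(kappa_values):
    """Estimate S_3 = <kappa^3> / <kappa^2>^2 from smoothed-map pixel values."""
    n = len(kappa_values)
    mean = sum(kappa_values) / n
    second = sum((k - mean) ** 2 for k in kappa_values) / n
    third = sum((k - mean) ** 3 for k in kappa_values) / n
    return third / second ** 2
```

Because the denominator is the square of the variance rather than its 3/2 power, $S_3$ is not dimensionless in the usual statistical sense: rescaling the map by a factor $a$ changes $S_3$ by $1/a$, which is why values of order 100 arise for convergence fluctuations of order a percent.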

In figure 41, we plot the skewness based on the halo model. Here, we show the
skewness as a function of maximum mass, ranging from $10^{14}$ to
$10^{16}\,M_\odot$ (in order of increasing skewness). The assumption is that
certain surveys, either by design in the case of so-called blank fields or by
chance, will not contain the most massive halos in the universe. Thus, by
arbitrarily cutting off the maximum mass when integrating over the mass
function, one can estimate how sensitive the statistics are to the presence of
the massive and rare objects in the universe. Our total maximum skewness agrees
with what is predicted by numerical particle-mesh simulations [289] and yields
a value of ~116 at 10'. However, it is lower than predicted by HEPT arguments
and simulations of [134], which suggest a skewness of ~140 at angular scales of
10' [128]. The HEPT prediction generally overpredicts the skewness because it
is extended to the mildly non-linear regime of clustering, which contributes to
the skewness at arcminute scales, from the deeply non-linear regime,
corresponding to angular scales of a few arcseconds, where it is expected to be
valid. The skewness based on second-order PT [14] is lower than the maximum
skewness predicted by the halo calculation and, by construction, agrees with
the skewness in the linear regime.

Fig. 41. The skewness, $S_3(\sigma)$, as a function of angular scale. The filled
symbols indicate the mean and variance computed from a set of $\kappa$ planes
generated in particle-mesh (PM) simulations by [289]. Under the halo model,
shown here is the skewness with varying maximum mass used in the calculation
(solid lines ranging from $10^{14}$ to $10^{16}\,M_\odot$). For comparison, we
also show skewness values as predicted by hyper-extended perturbation theory
(HEPT) and second-order perturbation theory (PT). Figure is reproduced based on
[289] and [55].

The effect of maximum mass on the skewness is interesting. When the maximum
mass is decreased to $10^{15}\,M_\odot$ from the value where the skewness
saturates (~$10^{16}\,M_\odot$), the skewness decreases from ~116 to 98 at an
angular scale of 10', though the convergence power spectrum changes by less
than a few percent for the same change in the maximum mass. When the maximum
mass used in the calculation is $10^{14}\,M_\odot$, the skewness at 10' is ~40,
which is roughly a factor of 3 lower than the total.

Thus, the absence of rare and massive halos in observed fields will certainly
bias the skewness measurement away from the cosmological mean, which has been
suggested as a probe of the cosmological matter density given that
$S_3 \propto \Omega_m^{-0.8}$ [14]. One, therefore, needs to exercise caution in
using the skewness to constrain cosmological models [128]. Still, this does not
mean that non-Gaussianity measured in small fields, where there is likely to be
a significant bias due to the lack of massive halos, will be useless. With the
halo approach, one can calculate the expected skewness given some information
related to the mass distribution of halos within the observed fields. This
knowledge may come externally, such as through X-ray and Sunyaev-Zel'dovich
measurements, or internally from the lensing data themselves, independent of
cosmology. Alternatively, if cosmology is assumed, one can also use
non-Gaussian information from weak lensing to constrain some aspect of the
large scale structure halo mass distribution, such as the high mass end of the
mass function.



8.6 Weak Gravitational lensing Covariance 



For the purpose of this calculation, we assume that upcoming weak lensing
surveys will measure binned logarithmic band powers of the convergence power
spectrum at several $l_i$'s in multipole space, with bins of thickness
$\delta l_i$:

$$\mathcal{C}_i = \int_{s_i} \frac{d^2 l}{A_{s,i}} \frac{l^2}{2\pi}\, \kappa(\mathbf{l})\kappa(-\mathbf{l}) \,,$$ (206)

where $A_{s,i} \equiv A_s(l_i) = \int_{s_i} d^2 l$ is the area of the
two-dimensional shell in multipole space and can be written as
$A_s(l_i) = 2\pi l_i \delta l_i + \pi (\delta l_i)^2$.

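We can now write the signal covariance matrix as

$$C_{ij} = \frac{1}{A} \left[ \frac{(2\pi)^2}{A_{s,i}}\, 2\, \mathcal{C}_i^2\, \delta_{ij} + T^\kappa_{ij} \right] \,,$$ (207)

$$T^\kappa_{ij} = \int \frac{d^2 l_i}{A_{s,i}} \int \frac{d^2 l_j}{A_{s,j}} \frac{l_i^2 l_j^2}{(2\pi)^2}\, T^\kappa(\mathbf{l}_i,-\mathbf{l}_i,\mathbf{l}_j,-\mathbf{l}_j) \,,$$ (208)

where $A = 4\pi f_{\rm sky}$ is the area of the survey in steradians when the
fraction of sky covered is $f_{\rm sky}$. Again, the first term is the Gaussian
contribution to the sample variance and the second term is the non-Gaussian
contribution. A realistic survey will also have shot noise variance due to the
finite number of source galaxies in the survey. Note that in the Gaussian limit
with $T^\kappa = 0$, when $\delta l_i = 1$, the covariance in equations
(207)-(208) reduces to $(\Delta C_l)^2$ given in equation (192).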
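
The structure of the covariance can be sketched in a few lines. The helper names below (`shell_area`, `band_power_covariance`) and the input arrays are hypothetical; the band-averaged trispectrum `t` is assumed to have been computed elsewhere, for instance from the halo model, and is simply passed in.

```python
import math

def shell_area(l_i, dl_i):
    """Area of the multipole shell, A_s(l_i) = 2 pi l_i dl_i + pi dl_i^2."""
    return 2.0 * math.pi * l_i * dl_i + math.pi * dl_i ** 2

def band_power_covariance(band_powers, shell_areas, t, f_sky):
    """Gaussian (diagonal) plus non-Gaussian (trispectrum) band-power covariance."""
    area = 4.0 * math.pi * f_sky  # survey area in steradians
    n = len(band_powers)
    # Non-Gaussian term: band-averaged trispectrum T_ij divided by the survey area.
    cov = [[t[i][j] / area for j in range(n)] for i in range(n)]
    # Gaussian term: contributes only on the diagonal.
    for i in range(n):
        cov[i][i] += (2.0 * math.pi) ** 2 / shell_areas[i] * 2.0 * band_powers[i] ** 2 / area
    return cov
```

With a vanishing trispectrum, unit bin width and full sky coverage, the diagonal entry becomes $2\mathcal{C}_i^2/(2l_i+1)$, the familiar Gaussian sample variance per multipole.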

Following equation (201), the convergence trispectrum is related to the density
trispectrum by the projection [235,56]

$$T^\kappa(\mathbf{l}_1,\mathbf{l}_2,\mathbf{l}_3,\mathbf{l}_4) = \int dr\, \frac{[W^{\rm lens}(r)]^4}{d_A^6}\, T\left(\frac{\mathbf{l}_1}{d_A},\frac{\mathbf{l}_2}{d_A},\frac{\mathbf{l}_3}{d_A},\frac{\mathbf{l}_4}{d_A};r\right) \,,$$ (209)

with the weight function defined in equation (179) and
$\mathbf{l}_4 = -(\mathbf{l}_1 + \mathbf{l}_2 + \mathbf{l}_3)$.

Note that the configurations, in Fourier space, that contribute to the power
spectrum covariance have the form of parallelograms with
$\mathbf{l}_2 = -\mathbf{l}_1$ and $\mathbf{l}_4 = -\mathbf{l}_3$. Thus, it is
useful to consider the behavior of the trispectrum for such configurations. In
figure 37(b), we show the scaled trispectrum

$$\Delta^\kappa_{\rm sq}(l) = \frac{l^2}{2\pi}\, T^\kappa(\mathbf{l},-\mathbf{l},\mathbf{l}_\perp,-\mathbf{l}_\perp)^{1/3} \,,$$ (210)

where $l_\perp = l$ and $\mathbf{l} \cdot \mathbf{l}_\perp = 0$. The projected
lensing trispectrum again shows the same behavior as the density field
trispectrum, with similar conditions on the $l$'s.

We can now use this trispectrum to study the contributions to the covariance,
which is what we are primarily concerned with here. In figure 42(a), we show
the fractional error,

$$\frac{\Delta \mathcal{C}_i}{\mathcal{C}_i} \equiv \frac{\sqrt{C_{ii}}}{\mathcal{C}_i} \,,$$ (211)

for bands $l_i$ given in Table 3, following the binning scheme used by [289] on
6° x 6° fields.

The dashed line compares that with the Gaussian errors, involving the first
term in the covariance (equations 207-208). At multipoles of a few hundred and
greater, the non-Gaussian term begins to dominate the contributions. For this
reason, the errors are well approximated by simply taking the Gaussian and
single halo contributions. In figure 42(b), we compare these results with those
of the [289] simulations. The decrease in errors from the simulations at small
$l$ reflects finite box effects that convert variance to covariance as the
fundamental mode in the box becomes comparable to the bandwidth.



The correlation between the bands is given by

$$\hat{C}_{ij} \equiv \frac{C_{ij}}{\sqrt{C_{ii} C_{jj}}} \,.$$ (212)

In table 3, we compare the halo predictions to the simulations by [289]. The
upper triangle shows the correlations under the halo approach, while the lower
triangle shows the correlations found in numerical simulations. The
correlations along individual columns increase as one goes to large $l$'s, or
small angular scales, consistent with simulations. In figure 43, we show the
correlation coefficients with (a) and without (b) the Gaussian contribution to
the diagonal.
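
The normalization of the covariance into correlation coefficients is simple enough to sketch directly. The functions below are illustrative only (their names are ours); the same routine applies to the full covariance $C_{ij}$ and to the trispectrum-only matrix $T^\kappa_{ij}$, and the second helper checks the Schwarz inequality that any valid covariance must satisfy.

```python
import math

def correlation_matrix(cov):
    """Normalize a covariance matrix: C_ij / sqrt(C_ii C_jj)."""
    n = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
            for i in range(n)]

def satisfies_schwarz(cov, tol=1e-12):
    """True if every correlation coefficient has magnitude at most one."""
    corr = correlation_matrix(cov)
    return all(abs(value) <= 1.0 + tol for row in corr for value in row)
```

By construction the diagonal of the normalized matrix is unity, as in Table 3.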
Fig. 42. The fractional errors in the measurements of the convergence band powers.
In (a), we show the fractional errors under the Gaussian approximation, the full halo
description, the Gaussian plus single halo term, and the Gaussian plus shot noise
term (see equation 218). As shown, the additional variance can be modeled with the
single halo piece while shot noise generally becomes dominant before non-Gaussian
effects become large. In (b), we compare the halo model with simulations from [289].
The decrease in the variance at small $l$ in the simulations is due to the
conversion of variance to covariance by the finite box size of the simulations.



We show in figure 43(a) the behavior of the correlation coefficient for a fixed
$l_j$ as a function of $l_i$. When $l_i = l_j$ the coefficient is 1 by
definition. Due to the presence of the dominant Gaussian contribution at
$l_i = l_j$, the coefficient has an apparent discontinuity between $l_i = l_j$
and $l_i = l_{j-1}$ that decreases as $l_j$ increases and non-Gaussian effects
dominate.

Fig. 43. (a) The correlation coefficient, $\hat{C}_{ij}$, as a function of the
multipole $l_i$ with $l_j$ as shown in the figure. We show the correlations
calculated with the full halo model and also with only the single halo term for
$l_j = 77072$. In (b), we show the non-Gaussian correlation coefficient
$\hat{C}^{\rm NG}_{ij}$, which only involves the trispectrum (see equation 213).
The transition to full correlation is due to the domination of the single halo
contribution.

To better understand this behavior it is useful to isolate the purely
non-Gaussian correlation coefficient

$$\hat{C}^{\rm NG}_{ij} = \frac{T^\kappa_{ij}}{\sqrt{T^\kappa_{ii} T^\kappa_{jj}}} \,.$$ (213)

As shown in figure 43(b), the coefficient remains constant for $l_i \ll l_j$
and smoothly increases to unity across a transition scale that is related to
where the single halo term starts to contribute. A comparison of figures 43(b)
and 37(b) shows that this transition happens around $l$ of a few hundred to
1000.
Table 3

Weak Lensing Convergence Power Spectrum Correlations

l_bin      97     138     194     271     378     529     739    1031    1440    2012
   97    1.00    0.04    0.05    0.07    0.08    0.09    0.09    0.09    0.08    0.08
  138  (0.26)    1.00    0.08    0.10    0.11    0.12    0.12    0.12    0.11    0.11
  194  (0.12)  (0.31)    1.00    0.14    0.17    0.18    0.18    0.17    0.16    0.15
  271  (0.10)  (0.21)  (0.26)    1.00    0.24    0.25    0.25    0.24    0.22    0.21
  378  (0.02)  (0.09)  (0.24)  (0.38)    1.00    0.33    0.33    0.32    0.30    0.28
  529  (0.10)  (0.14)  (0.28)  (0.33)  (0.45)    1.00    0.42    0.40    0.37    0.35
  739  (0.12)  (0.16)  (0.17)  (0.34)  (0.38)  (0.50)    1.00    0.48    0.45    0.42
 1031  (0.15)  (0.18)  (0.15)  (0.27)  (0.33)  (0.48)  (0.54)    1.00    0.52    0.48
 1440  (0.18)  (0.15)  (0.19)  (0.19)  (0.32)  (0.36)  (0.53)  (0.57)    1.00    0.54
 2012  (0.19)  (0.22)  (0.16)  (0.32)  (0.27)  (0.46)  (0.50)  (0.61)  (0.65)    1.00


NOTES. — Covariance of the binned power spectrum when sources are at a redshift 
of 1. Upper triangle displays the covariance found under the halo model. Lower 
triangle (parenthetical numbers) displays the covariance found in numerical simu- 
lations by [289]. To be consistent with these simulations, we use the same binning 
scheme as the one used there. 



Once the power spectrum is dominated by correlations in single halos, the fixed 
profile of the halos will correlate the power in all the modes. The multiple halo 
terms on the other hand correlate linear and non-linear scales but at a level 
that is generally negligible compared with the Gaussian variance. 

Note that the behavior seen in the halo based covariance, however, is not
present when the covariance is calculated with hierarchical arguments for the
trispectrum (see [235]). With hierarchical arguments, which are by construction
only valid in the deeply non-linear regime, one predicts correlations which
are, in general, constant across all scales and show no decrease in
correlations between very small and very large scales. Such hierarchical models
also violate the Schwarz inequality, with correlations greater than 1 between
large and small scales (e.g., [235,104]). The halo model, however, shows a
decrease in correlations similar to numerical simulations, suggesting that the
halo model, at least qualitatively, provides a better approach to studying
non-Gaussian correlations in the translinear regime.



8.6.1 Scaling Relations 

To better understand how the non-Gaussian contribution scales with our
assumptions, we can consider the ratio of the non-Gaussian variance to the
Gaussian variance

$$\frac{C_{ii}}{C^{\rm G}_{ii}} = 1 + R \,,$$ (214)

with

$$R \equiv \frac{A_{s,i}\, T^\kappa_{ii}}{(2\pi)^2\, 2\, \mathcal{C}_i^2} \,.$$ (215)

Under the assumption that contributions to lensing convergence can be written
through an effective distance $r_\star$, at half the angular diameter distance
to background sources, and a width $\Delta r$ for the lensing window function,
the ratio of the lensing convergence trispectrum and power spectrum
contributions to the variance can be further simplified to

$$R \sim \frac{A_{s,i}\, T(r_\star)}{(2\pi)^2\, V_{\rm eff}\, 2 P^2(r_\star)} \,.$$ (216)

Since the lensing window function peaks at $r_\star$, we have replaced the
integral over the window function of the density field trispectrum and power
spectrum by its value at the peak. This ratio shows how the relative
contribution from non-Gaussianities scales with survey parameters: (a)
increasing the bin size, through $A_{s,i}$ ($\propto \delta l$), leads to a
linear increase in the non-Gaussian contribution, (b) increasing the source
redshift, through the effective volume of lenses in the survey
($V_{\rm eff} \sim r_\star^2 \Delta r$), decreases the non-Gaussian
contribution, while (c) the growth of the density field trispectrum and power
spectrum, through the ratio $T/P^2$, decreases the contribution as one moves to
a higher redshift. The volume factor quantifies the number of foreground halos
in the survey that effectively act as gravitational lenses for background
sources; as the number of such halos is increased, the non-Gaussianities are
reduced by the central limit theorem.

In figure 44, we summarize our results as a function of source redshift with
$l_i \sim 10^2$, $10^3$ and $10^4$, setting the bin width such that
$A_s(l_i) \sim l_i^2$, or $\delta l \sim l$. As shown, increasing the source
redshift leads to a decrease in the non-Gaussian contribution to the variance.
The prediction based on the simplifications in equation (216) tends to
overestimate the non-Gaussianity at lower redshifts while underestimating it at
higher redshifts, though the exact transition depends on the angular scale of
interest; this behavior can be understood from the fact that we do not consider
the full lensing window function but only the contributions at an effective
redshift, midway between the observer and sources.
In order to determine whether it is the increase in volume or the decrease in
the growth of structures that leads to a decrease in the relative importance
of non-Gaussianities as one moves to a higher source redshift, we calculated
the non-Gaussian to Gaussian variance ratio under the halo model for several
source redshifts and survey volumes. Up to source redshifts ~1.5, the increase
in volume decreases the non-Gaussian contribution significantly. When surveys
are sensitive to sources at redshifts beyond 1.5, the increase in volume
becomes less significant and the decrease in the growth of structures begins to
be important in decreasing the non-Gaussian contribution. Since, in the deeply
non-linear regime, $T/P^2$ scales with redshift as the cube of the growth
factor, this behavior is consistent with the overall redshift scaling of the
volume and growth.

The importance of the non-Gaussianity to the variance also scales linearly
with bin width. As one increases the bin width, the covariance induced by the
non-Gaussianity manifests itself as increased variance relative to the
Gaussian case. The normalization of $R$ is therefore somewhat arbitrary in that
it depends on the binning scheme, i.e. $R \ll 1$ does not necessarily mean that
non-Gaussianity can be entirely neglected when summing over all the bins. The
scaling with redshift and the overall scaling of the variance with the survey
area $A$ are not arbitrary. One way to get around the increased non-Gaussianity
associated with shallow surveys is to have them sample a wide patch of sky,
since $C_{ii} \propto (1 + R)/A$. This relation quantifies the trade-off
between designing a survey to go wide instead of deep. One should bear in mind,
though, that not only will shallow surveys have decreasing number densities of
source galaxies and hence increasing shot noise, they will also suffer more
from the decreasing amplitude of the signal itself and the increasing
importance of systematic effects, including the intrinsic correlations of
galaxy shapes (e.g., [37,60,110]). These problems tilt the balance more towards
deep but narrow surveys than the naive statistical scaling would suggest.
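
The scaling arguments of this subsection can be collected in a small sketch. It follows the approximate form of equation (216), including the factor of 2 in the Gaussian term, and the relation $C_{ii} \propto (1+R)/A$; the function names and all numerical inputs are hypothetical and serve only to show the dependence on bin area, effective volume and survey area.

```python
import math

def non_gaussian_ratio(shell_area, trispectrum, power, r_star, delta_r):
    """Approximate R of equation (216) at the peak r_star of the lensing window."""
    v_eff = r_star ** 2 * delta_r  # effective volume of lenses, V_eff ~ r_star^2 * delta_r
    return shell_area * trispectrum / ((2.0 * math.pi) ** 2 * v_eff * 2.0 * power ** 2)

def relative_variance(ratio, survey_area):
    """Band-power variance up to a constant factor, C_ii proportional to (1 + R) / A."""
    return (1.0 + ratio) / survey_area
```

For instance, a shallow survey with $R = 1$ needs twice the area of a deep survey with negligible $R$ to reach the same band-power variance, before shot noise and systematic effects are taken into account.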



8.6.2 The effect of non-Gaussianities 

With steady improvements on the observational front, it is likely that weak
lensing will eventually reach its full potential as a complementary probe of
cosmological parameters when compared to the angular power spectrum of CMB
anisotropies (see, e.g., [121]). Thus, for a proper interpretation of
observational measurements of the lensing convergence power spectrum or shear
correlation functions, it will be essential to include the associated full
covariance or error matrix in upcoming analyses. In the absence of many fields
where the covariance can be estimated directly from the data, the halo model
provides a useful, albeit model dependent, quantification of the covariance. As
a practical approach, one could imagine taking the variances estimated from the
survey under a Gaussian approximation, but which accounts for uneven sampling
and edge effects [126], and scaling it up by the non-Gaussian to Gaussian
variance ratio of the halo model, along with inclusion of the band power
correlations. Additionally, it is in principle possible to use the expected
correlations from the halo model to decorrelate individual band power
measurements, similar to studies involving CMB temperature anisotropy and
galaxy power spectra (e.g., [103,105]).

Fig. 44. The ratio of non-Gaussian to Gaussian contributions, $R$, as a function
of source redshift ($z_s$). The solid lines are through the exact calculation
(equation 215) while the dotted lines use the approximation given in equation
(216). Here, we show the ratio $R$ for three multipoles corresponding to large,
medium and small angular scales. The multipole binning is kept constant such
that $\delta l \sim l$. Decreasing this bin size will linearly decrease the
value of $R$.

The resulting non-Gaussian effects on cosmological parameter estimation were
discussed in [56]. In [121], the potential of wide-field lensing surveys to
measure cosmological parameters was investigated using the Gaussian
approximation of a diagonal covariance and Fisher matrix techniques. The Fisher
matrix is simply a projection of the covariance matrix, $C$, onto the basis of
cosmological parameters $p_i$

$$F_{\alpha\beta} = \sum_{ij} \frac{\partial \mathcal{C}_i}{\partial p_\alpha} \left(C_{\rm tot}^{-1}\right)_{ij} \frac{\partial \mathcal{C}_j}{\partial p_\beta} \,,$$ (217)

where the total covariance includes both the signal and noise covariance. Under
the approximation of Gaussian shot noise, this reduces to replacing
$C_l^\kappa \rightarrow C_l^\kappa + C_l^{\rm SN}$ in the expressions leading up
to the covariance (equations 207-208). In the case where the non-Gaussian
contribution to the covariance is ignored, equation (217) reduces to [121,48]

$$F_{\alpha\beta} = \sum_{l} \frac{f_{\rm sky}(l + 1/2)}{(C_l^\kappa + C_l^{\rm SN})^2} \frac{\partial C_l^\kappa}{\partial p_\alpha} \frac{\partial C_l^\kappa}{\partial p_\beta} \,.$$ (218)

Under the approximation that there are a sufficient number of modes in the
band powers that the distribution of power spectrum estimates is approximately
Gaussian, the Fisher matrix quantifies the best possible errors on cosmological
parameters that can be achieved by a given survey. In particular, $F^{-1}$ is
the optimal covariance matrix of the parameters and $(F^{-1})^{1/2}_{ii}$ is
the optimal error on the $i$th parameter.
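
A minimal sketch of the Gaussian Fisher matrix of equation (218) follows. The function name and the argument layout (one list of derivatives $\partial C_l/\partial p_\alpha$ per parameter) are assumptions made for illustration; the power spectra, noise and derivatives would come from a model such as the halo calculation described above.

```python
def gaussian_fisher_matrix(ells, c_l, noise_l, derivs, f_sky):
    """F_ab = sum_l f_sky (l + 1/2) / (C_l + N_l)^2 * dC_l/dp_a * dC_l/dp_b."""
    n_par = len(derivs)
    fisher = [[0.0] * n_par for _ in range(n_par)]
    for idx, ell in enumerate(ells):
        weight = f_sky * (ell + 0.5) / (c_l[idx] + noise_l[idx]) ** 2
        for a in range(n_par):
            for b in range(n_par):
                fisher[a][b] += weight * derivs[a][idx] * derivs[b][idx]
    return fisher
```

For a single amplitude parameter $p$ with $C_l = p\, s_l$ and no noise, the sum collapses to $F = f_{\rm sky} \sum_l (l + 1/2)/p^2$, so the fractional error $1/\sqrt{F p^2}$ depends only on the number of modes; adding shot noise lowers the weight of the noise-dominated multipoles.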

For a cosmological model involving a set of 5 parameters, $\Omega_\Lambda$, the
normalization of the power spectrum, $\Omega_K = 1 - \Omega_m - \Omega_\Lambda$,
$n_s$ and $\Omega_m h^2$, Cooray & Hu [55] found that non-Gaussianities
increase the uncertainties of each of the 5 parameters determined from an
all-sky experiment down to the 25th magnitude, and assuming all sources at a
redshift of ~1, by about 10 to 15%. In the case of weak lensing, the shot noise
due to the finite number of background sources and their intrinsic ellipticity
becomes the dominant error before the non-Gaussian effects dominate over the
Gaussian noise. Thus, for the above assumed depth and redshift, the
non-Gaussian effect on cosmological parameters is somewhat insignificant. For
certain planned deeper surveys with better imaging, such as planned surveys
with the Large-aperture Synoptic Survey Telescope (LSST; [281]), the shot-noise
term will be subdominant and the non-Gaussian contributions may be more
important for a precise determination of the cosmological parameters. As
discussed with scaling relations, § 8.6.1, the intrinsic non-Gaussian
contribution decreases with increasing survey depth, and thus deeper surveys
are in fact preferred over shallow ones for the purposes of cosmological
lensing work.



8.7 The Galaxy-Mass Cross-Correlation



Our description for the galaxy power spectrum, see § 6, allows us to extend 
the discussion to also consider the cross-correlation between galaxies and mass. 
Such a cross-power spectrum can be probed through two independent meth- 
ods: (1) the weak lensing tangential shear-galaxy correlation function and (2) 
the foreground-background source correlation function. As we find later, these 
two correlations probe different scales in the galaxy-mass power spectrum. 
Fig. 45. The SDSS galaxy-mass cross-correlation using the galaxy-shear correlation
function. We show the halo model prediction with a solid line. The data are from [83].

8.8 Shear-Galaxy correlation 



The shear-galaxy correlation function can be constructed by correlating the
tangential shear of background galaxies surrounding foreground galaxies. The
assumption is that these foreground galaxies trace the mass distribution along
the line of sight to background sources. Here, observations involve the mean
tangential shear due to gravitational lensing, which is related to the
convergence through

$$\langle \gamma_t(\theta) \rangle = -\frac{1}{2} \frac{d\bar{\kappa}(\theta)}{d\ln\theta} \,,$$ (219)

where $\bar{\kappa}(\theta)$ is the mean convergence within a circular radius of
$\theta$ [148,267,99].

Since the shear, averaged over a circular aperture, is correlated with
foreground galaxy positions, one essentially probes the galaxy-mass correlation
discussed in § 8.7 such that

$$\bar{\kappa}(\theta) = \int dr\, W^{\rm lens}(r)\, W^{\rm gal}(r) \int dk\, k\, P_{\rm gal-DM}(k)\, \frac{2 J_1(k d_A \theta)}{k d_A \theta} \,.$$ (220)

Following equation (219), we can write the mean tangential shear involved
with galaxy-galaxy lensing as

$$\langle \gamma_t(\theta) \rangle = \int dr\, W^{\rm lens}(r)\, W^{\rm gal}(r) \int dk\, k\, P_{\rm gal-DM}(k)\, J_2(k d_A \theta) \,.$$ (221)

Here, $W^{\rm lens}$ is the lensing window function introduced in equation
(179), while $W^{\rm gal}$ is the normalized redshift distribution of
foreground galaxies. Note that, in general, $W^{\rm lens}$ involves the redshift
distribution of background sources beyond the simple single source redshift
assumption we have considered in prior calculations.
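
The step from the mean convergence to the tangential shear rests on a Bessel function identity: with $x = k d_A \theta$, the logarithmic derivative of the disc-averaged kernel $2J_1(x)/x$ gives $-\frac{1}{2}\, d[2J_1(x)/x]/d\ln x = J_2(x)$, which is how a $J_1$ kernel for $\bar{\kappa}$ turns into a $J_2$ kernel for $\langle\gamma_t\rangle$. The sketch below checks this numerically using only the standard library; `bessel_j` evaluates Bessel's integral with the trapezoid rule, and all function names are ours.

```python
import math

def bessel_j(n, x, steps=2000):
    """J_n(x) from Bessel's integral (1/pi) * int_0^pi cos(n t - x sin t) dt."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end points
        total += weight * math.cos(n * t - x * math.sin(t))
    return total * h / math.pi

def mean_kappa_kernel(x):
    """Kernel of the mean convergence inside a disc, 2 J_1(x) / x."""
    return 2.0 * bessel_j(1, x) / x

def shear_kernel_from_derivative(x, eps=1e-4):
    """-(1/2) d/dln(x) of the mean-convergence kernel, by central differences."""
    up = mean_kappa_kernel(x * math.exp(eps))
    down = mean_kappa_kernel(x * math.exp(-eps))
    return -0.5 * (up - down) / (2.0 * eps)
```

Comparing `shear_kernel_from_derivative(x)` with `bessel_j(2, x)` at a few values of `x` reproduces the identity to within the accuracy of the finite difference.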

The highest signal-to-noise measurement yet of the tangential lensing
correlation around foreground galaxies comes from the Sloan Digital Sky Survey
[83]. We compare these measurements with a prediction based on the halo model
in figure 45. Here, for simplicity, we take the same description for galaxy
number counts as introduced in § 6, and calculate the galaxy-dark matter
correlation function following equation (134). In calculating the expected
correlation function, we have used the expected redshift distributions for
foreground and background galaxies in the Sloan samples. The observed
measurements shown in figure 45 come from the Sloan survey for field galaxies;
tangential shear measurements around a selected sample of 42 foreground galaxy
clusters in Sloan data were recently presented by [248]. Traditionally, the
galaxy-galaxy lensing correlation function, similar to the above, was
interpreted through a mass and a size distribution for foreground galaxies,
with foreground galaxies generally assumed to be distributed randomly. This, or
similar approaches, allow constraints on certain galaxy properties such as mass
and size (see [83] for details).

The halo model provides an alternative, and perhaps an improved, description
consistent with our basic ideas of large scale structure: since galaxies
effectively trace the dark matter halos and it is the dark matter that is
mostly responsible for the tangential lensing of background sources, the
constraints on mass and size effectively apply to the halos that galaxies
reside in. If field galaxies are simply selected as foreground sources, then
the constraint on mass and size applies to the dark matter halos of the sample,
each of which contains a single galaxy. If the foreground sample contains
contributions from a wide variety of dark matter halo mass scales, then more
than one galaxy can reside in dark matter halos at the high mass end and a
simple interpretation may not be possible. Additionally, since halos themselves
trace the large scale structure, one should account for the clustering
component, i.e., the 2-halo term of the dark matter-galaxy correlation
function, when extracting statistical properties related to individual halos
that contribute along the line of sight. As shown in figure 45, the total halo
model prediction, both due to individual halos and their clustering, is
consistent with observed measurements; the correlation at the largest angular
scales is due to the intrinsic clustering of halos and cannot be simply
interpreted as a large extent for the dark matter halos. A more thorough study
of the weak lensing shear-galaxy cross-correlation, under the halo model, is
available in [99] and we refer the reader to this paper for further details.

Fig. 46. The window functions involved with the projection of the galaxy-mass
power spectrum in producing the tangential shear-galaxy correlation ($J_2$) and
the foreground-background galaxy correlation ($J_0$). Note that the tangential
shear-galaxy correlation probes smaller physical scales in the galaxy-mass
power spectrum and is, thus, more sensitive to the non-linear aspect of this
correlation function, such as the single-halo contribution.



8.9 Foreground-background source correlation 

The second observational probe of the galaxy-mass correlation function comes 
from the clustering of background sources around foreground objects. One 
can construct a power spectrum by simply counting the number of objects, 
such as quasars or X-ray sources, surrounding a sample of foreground sources, 
such as galaxies. The dependence on the correlation comes from the fact that 
foreground sources trace the mass density field which can potentially affect 
the number counts of background sources by the weak lensing effect. 

To understand this correlation, we can consider a sample of background sources
whose number counts can be written as

$$N(s) = N_0 s^{-\alpha} \,,$$ (222)

where $s$ is the flux and $\alpha$ is the slope of the number counts (see
footnote 2). Due to lensing, when the amplification involved is $\mu$, one
probes to a lower flux limit $s/\mu$ while the total number of sources is
reduced by another factor $\mu$; the latter results from the decrease in volume
such that the total surface brightness is conserved in lensing. Thus, in the
presence of lensing, number counts are changed to

$$N(s) = \frac{N_0}{\mu} \left(\frac{s}{\mu}\right)^{-\alpha} = N_0 s^{-\alpha} \mu^{\alpha - 1} \,.$$ (223)

In the limit of weak lensing, as more appropriate for the large scale
structure, $\mu \approx (1 + 2\kappa)$, where the convergence $\kappa$ was
defined in equation (172), so that $\mu^{\alpha-1} \approx 1 + 2(\alpha-1)\kappa$.
This allows us to write the fluctuations in background number counts,
$N_b(\hat{\mathbf{n}}) = \bar{N}_b[1 + \delta N_b(\hat{\mathbf{n}})]$, in the
presence of foreground lensing as [191,192]

$$\delta N_b(\hat{\mathbf{n}}) = 2(\alpha - 1)\kappa(\hat{\mathbf{n}}) = 2(\alpha - 1) \int_0^{r_0} dr\, W^{\rm lens}(r)\, \delta(\hat{\mathbf{n}}, r) \,,$$ (224)

where the lensing weight function integrates over the background source
population following equation (179).
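
The magnification bias argument can be checked with a short numerical sketch. The functions below (names ours) apply the two lensing effects to power-law counts, the fainter flux limit $s/\mu$ and the dilution by $1/\mu$, and compare the resulting fractional change with the weak lensing limit $2(\alpha-1)\kappa$; the inputs are illustrative values only.

```python
def lensed_counts(s, n0, alpha, mu):
    """Counts N(s) = (N_0 / mu) * (s / mu)^(-alpha) behind a lens of magnification mu."""
    return (n0 / mu) * (s / mu) ** (-alpha)

def exact_count_fluctuation(alpha, kappa):
    """Fractional change in counts for mu = 1 + 2 kappa, i.e. mu^(alpha - 1) - 1."""
    mu = 1.0 + 2.0 * kappa
    return mu ** (alpha - 1.0) - 1.0

def linear_count_fluctuation(alpha, kappa):
    """Weak lensing limit, delta N = 2 (alpha - 1) kappa."""
    return 2.0 * (alpha - 1.0) * kappa
```

For $\alpha = 1$ the two effects cancel exactly and lensing leaves the counts unchanged; slopes steeper than unity give an excess of background sources behind overdensities, and shallower slopes a deficit.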

The foreground sources are assumed to trace the density field and, based on
the source clustering, one can write the fluctuations in the foreground source
population, $N_f(\hat{\mathbf{n}}) = \bar{N}_f[1 + \delta N_f(\hat{\mathbf{n}})]$, as

$$\delta N_f(\hat{\mathbf{n}}) = \int_0^{r_0} dr\, n_f(r)\, \delta_g(\hat{\mathbf{n}}, r) \,,$$ (225)

where $n_f(r)$ is the radial distribution of foreground sources.

We can write the correlation between the foreground and background sources
as

$$w_{fb}(\theta) = \langle \delta N_f(\hat{\mathbf{n}})\, \delta N_b(\hat{\mathbf{n}} + \theta) \rangle = 2(\alpha - 1) \int_0^{r_0} dr\, n_f(r)\, W^{\rm lens}(r) \int_0^\infty \frac{k\, dk}{2\pi}\, P_{\rm gal-DM}(k)\, J_0(k d_A \theta) \,,$$ (226)

where we have simplified using the Fourier expansion of equations (224) and
(225), and have introduced the galaxy-mass power spectrum.

Footnote 2: Similarly, we can describe this calculation with counts based on
magnitudes instead of flux. In that case, one should replace $\alpha$ with
$2.5\alpha_m$, where $\alpha_m = d\log N(m)/dm$ is the logarithmic slope of the
magnitude counts.

In the case where foreground and background sources are not distinctly
separated in radial distance, note that there may be an additional correlation
resulting from the fact that background sources trace the same overlapping
density field traced by the foreground sources. This leads to a clustering term

$$w^{\rm clus}_{fb}(\theta) = \langle \delta N_f(\hat{\mathbf{n}})\, \delta N_b(\hat{\mathbf{n}} + \theta) \rangle = \int_0^{r_0} dr\, n_f(r)\, n_b(r) \int_0^\infty \frac{k\, dk}{2\pi}\, P_{\rm gal-source}(k)\, J_0(k d_A \theta) \,,$$ (227)

where $P_{\rm gal-source}(k)$ is now the cross power spectrum between the
foreground source sample, galaxies in this case, and the population of
background sources, such as quasars. This cross power spectrum can be modeled
under the halo approach by introducing a description of how background sources
populate dark matter halos, similar to the description for galaxies. This
clustering component usually becomes a source of contamination for the
detection of the background source-foreground galaxy correlation due to weak
lensing alone.

Note that the background-foreground source correlation, equation (226), and the
tangential shear-foreground galaxy correlation, equation (221), weigh the
galaxy-mass cross-power spectrum with two different window functions involving
a $J_0$ and a $J_2$, respectively. For a given projected distance $d_A\theta$, the two
observational methods probe the galaxy-mass power spectrum at different scales.
As shown in figure 46, the tangential shear-foreground galaxy correlation
function probes the non-linear scales of the galaxy-mass correlation and is
thus more sensitive to the behavior of the single-halo contribution than the
foreground-background correlation function of sources. The sensitivity of the
shear-galaxy correlation to non-linear scales suggests that it is better suited
to probing the physical aspects of how foreground galaxies trace their dark
matter halos. On the other hand, the foreground-background source correlation
function probes the clustering aspects of foreground sources that trace the
linear density field.
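
The different scale sensitivity of the two window functions can be illustrated
with a minimal Python sketch. It evaluates $J_0$ and $J_2$ through Bessel's
integral representation, with the argument x standing for $k d_A \theta$. The
function names are ours and purely illustrative; nothing here is taken from the
calculations behind figure 46. For $x \ll 1$, $J_0 \to 1$ while
$J_2 \simeq x^2/8$, so modes with wavelengths much longer than the projected
separation are suppressed in the shear-galaxy correlation but not in the
foreground-background source correlation.

```python
import math


def bessel_j(n, x, steps=2000):
    """Integer-order Bessel function J_n(x) from Bessel's integral,
    J_n(x) = (1/pi) * int_0^pi cos(n*t - x*sin(t)) dt,
    evaluated with the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi


def window_ratio(x):
    """J_2(x) / J_0(x): weight given to a mode with k * d_A * theta = x by the
    shear window relative to the magnification window (valid away from the
    zeros of J_0)."""
    return bessel_j(2, x) / bessel_j(0, x)


if __name__ == "__main__":
    for x in (0.1, 0.5, 1.0, 2.0):
        print(f"x = {x:3.1f}  J0 = {bessel_j(0, x):+.4f}  "
              f"J2 = {bessel_j(2, x):+.4f}")
```

The ratio returned by window_ratio falls as $x^2/8$ towards small x, which is
the sense in which the $J_2$ window removes the largest scales at fixed
$d_A\theta$.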

In figure 47, we show the expected correlation between foreground galaxies
in the Sloan Digital Sky Survey and background quasars at redshifts greater
than 1. The expected errors suggest that the correlation will be measured
out to angular scales of several degrees. Since sufficient statistics will soon
be available, the catalog can be divided into redshift bins and combined
with associated data on the shear-galaxy correlation for detailed studies of
galaxy-mass cross clustering.

Fig. 47. The expected foreground galaxy-background quasar correlation due to
lensing magnification under several descriptions of the galaxy-mass power
spectrum. The expected error bars are for the whole Sloan catalog of galaxies
with $21 < r' < 22$, as foreground sources, and Sloan quasars at redshifts
greater than 1, as background sources. The figure is from R. Scranton (in
preparation).



9 Halo applications to CMB: Secondary effects 

The angular power spectrum of cosmic microwave background (CMB) temperature
fluctuations is now a well known probe of cosmology. The anisotropies
can be well described through linear physics involving Compton scattering
and linearized general relativity. The well known features in the power
spectrum, the acoustic oscillations at large angular scales and the damping
tail at medium angular scales [214,273,261,124], make it possible to constrain
most, or certain combinations of, the parameters that define the currently
favored CDM models with a cosmological constant [159,141,22,297,75]. This has
led to a large number of experiments, with results so far providing evidence
for acoustic peaks as expected in models with adiabatic initial conditions and
a scale invariant power spectrum of fluctuations [187,67,107,101].

The small angular scale temperature anisotropies contain a wide variety of
information related to the growth and evolution of large scale structure,
including non-linear aspects of clustering. Such a contribution from low
redshifts is partly contrary to the general assumption that CMB fluctuations
are solely described by linear physics at the last scattering surface at a
redshift $z \sim 1100$. There are two mechanisms by which the large scale
structure of the local universe can modify the CMB temperature: gravity and
scattering. The gravitational contributions arise from variations in the
frequency of CMB photons via gravitational redshifts and blueshifts
[229,222,242,52,63,165] and via deflection
[26,155,169,43,230,287,89,38,243,120] and time-delay [125] effects on
CMB photons due to gravitational lensing. In the reionized epoch, with a
population of free electrons, the CMB photons can also be Compton-scattered
[206,285,144,68,72,130,123].

The large scale structure contributions to the CMB, either due to gravity or
scattering, can be modeled using the halo approach and their statistical
properties can be calculated in detail, similar to the application of the halo
model to galaxy and weak lensing statistics. Here, we will consider several
such secondary contributions, including the thermal and kinetic
Sunyaev-Zel'dovich (SZ; [274]) effects, the gravitational lensing modification
to the CMB, and the non-linear contribution to the integrated Sachs-Wolfe
effect (ISW; [229]) at small angular scales.

The anisotropy power spectrum at small angular scales has recently become
the focus of several theoretical and experimental studies. Though there are
several upper limits and an initial detection of anisotropy power at small
scales [66,118,271,40], a wide-field CMB image is yet to be produced with the
resolution necessary for studies related to secondary effects. To this end,
several experiments are now working towards obtaining such information with
either direct imaging or interferometric techniques. These experimental
attempts include the proposed 12 deg$^2$ survey by [33] at the combined and
expanded BIMA and OVRO arrays (CARMA), the Atacama Telescope (ACT; Lyman Page,
private communication), and the BOLOCAM array on the Caltech Submillimeter
Observatory (Andrew Lange, private communication). In the longer term, a few
thousand square degrees are proposed to be imaged over a few years with a
wide-field bolometer array at the South Pole Telescope (John Carlstrom, private
communication), and the Planck surveyor will allow detailed studies of certain
secondary effects and foregrounds via multi-frequency all-sky maps.

9.1 The Thermal SZ effect



The SZ thermal effect arises from the inverse-Compton scattering of CMB
photons by hot electrons along the line of sight. This effect has now been
directly imaged towards massive galaxy clusters (e.g., [33,140]), where the
temperature of the scattering medium can reach as high as 10 keV, producing
temperature changes in the CMB of order 1 mK at Rayleigh-Jeans wavelengths.
The SZ effect is now well known for its main cosmological application
involving measurements of the Hubble constant. The basic idea follows
from the initial suggestions by Gunn [97] and Silk & White [262]: the SZ
temperature decrement, $\Delta T \propto \int T_e n_e\, dl$, towards a given cluster can be
combined with the thermal Bremsstrahlung X-ray emission,
$S_X \propto \int T_e^{1/2} n_e^2\, dl$, towards the same cluster to obtain an estimate of the
line of sight distance through the cluster: $L \propto \Delta T^2 / (S_X T_e^{3/2})$. This requires a
measurement of $T_e(r)$ across the cluster; the isothermal assumption
$T_e(r) = T_0$ is generally employed due to limitations on the observational
front. A comparison of this distance to the projected separation of the
cluster across the sky determines the angular diameter distance to the cluster,
independent of the cosmological distance ladder (see [181,182,208,94,221,127]
for recent $H_0$ measurements). Through a cosmological model for the distance,
one can extract parameters such as the Hubble constant and, with measurements
over a wide range in redshift, values for the matter density and the
cosmological constant.
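
The way the electron density drops out of the combined SZ and X-ray measurement
can be made explicit with a small Python sketch. It assumes a uniform,
isothermal slab of gas, and the constants A and B are hypothetical placeholders
that absorb all physical prefactors in the two scalings quoted above; the
function names are ours. Eliminating $n_e$ from $\Delta T \propto T_e n_e L$
and $S_X \propto T_e^{1/2} n_e^2 L$ gives $L \propto \Delta T^2 / (S_X T_e^{3/2})$.

```python
def sz_decrement(n_e, t_e, depth, a_const=1.0):
    """Toy SZ decrement for a uniform isothermal slab: Delta T = A T_e n_e L."""
    return a_const * t_e * n_e * depth


def xray_brightness(n_e, t_e, depth, b_const=1.0):
    """Toy bremsstrahlung brightness: S_X = B T_e^(1/2) n_e^2 L."""
    return b_const * t_e ** 0.5 * n_e ** 2 * depth


def depth_from_sz_xray(delta_t, s_x, t_e, a_const=1.0, b_const=1.0):
    """Line of sight depth from the two observables and the temperature:
    L = (B / A^2) Delta T^2 / (S_X T_e^(3/2)); n_e is not needed."""
    return (b_const / a_const ** 2) * delta_t ** 2 / (s_x * t_e ** 1.5)


if __name__ == "__main__":
    # Arbitrary units; the density is used only to build the mock observables.
    n_e, t_e, depth = 3.0e-3, 8.0, 1.5
    d_t = sz_decrement(n_e, t_e, depth)
    s_x = xray_brightness(n_e, t_e, depth)
    print(depth_from_sz_xray(d_t, s_x, t_e))
```

The explicit $T_e^{3/2}$ factor is the reason a temperature measurement across
the cluster is required before the depth can be converted into a distance.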

There are several limitations that prohibit a reliable measurement of the
Hubble constant from the combined SZ and X-ray data, at least in the case of a
single cluster. The usual spherical assumption for clusters is inconsistent
with observations and can bias the distance measurement at the level of 10%
to 20% [49,272,220]. The isothermal assumption for the electron temperature has
been shown to be inconsistent with numerically simulated galaxy cluster gas
distributions, though this assumption is yet to be tested with observations
of galaxy clusters. In the case of clusters with significant cooling flows, it
is clear that a single temperature cannot be used to describe the electron
temperature; this again leads to biases at the few tens of percent level
[228,177]. Additional contributions at the 10% level and less include the
presence of contaminating radio point sources, either in the cluster [47] or
background sources gravitationally lensed by the cluster potential [170],
fluctuations in the background anisotropies lensed through the cluster [39],
and the peculiar velocity contribution through the kinetic SZ effect. Though,
in general, these effects limit the reliability of the Hubble constant measured
towards a single cluster, a significant sample of clusters is expected to
produce a measurement that is accurate at the few percent level. In the case of
projection effects involving ellipsoidal clusters, distributed following the
ellipticities observed for present-day cluster samples, it can be shown that
for a sample of at least 25 clusters, the mean Hubble constant is consistent
with the true value [49].

Fig. 48. Frequency dependence of the SZ effect at a multipole of $l \sim 5000$.
Here, we show the absolute value of temperature relative to the thermal CMB
spectrum. For comparison, we also show the temperature fluctuations due to
point sources (both radio at low frequencies and far-infrared sources at high
frequencies; solid line), galactic synchrotron (long dashed line), galactic
free-free (dotted line) and galactic dust (short dashed line). At small angular
scales, frequencies around 50 to 100 GHz are ideal for an SZ experiment.

In the future, when wide-field SZ surveys are available, we will be more
interested in the statistics of the SZ effect, such as the SZ correlation
function in real space or its power spectrum. Since, on top of the SZ effect,
one also gets a contribution from the CMB anisotropy fluctuations, it is clear
that one requires reliable ways to separate them, as well as contaminant
foregrounds such as radio point sources and galactic dust. Due to the nature of
inverse-Compton scattering, where photons are upscattered from low to high
frequencies, the SZ effect fortunately bears a spectral signature that differs
from other temperature fluctuations, including the dominant CMB primary
component (see figure 48). In upcoming multifrequency CMB data, the SZ
contribution can thus be separated using its frequency dependence. This allows
statistics related to the SZ effect to be studied independently of, say, the
dominant CMB temperature fluctuations. As discussed in detail in [58], a
multi-frequency approach can easily be applied to the Planck surveyor$^3$
mission (see figure 49).



3 http://astro.estec.esa.nl/Planck/; also, ESA D/SCI(6)3.

Fig. 49. Recovery of the SZ signal with Planck multifrequency data: (a) a line
of sight integrated model SZ map with the assumption that pressure traces dark
matter with a scale independent bias at all scales, (b) the map smoothed at
20', (c) this SZ signal plus noise from primary anisotropies and foregrounds,
and (d) the final recovered map with a SZ frequency spectrum. For Planck, the
recovered spectrum is consistent with the input spectrum and allows a
determination of the SZ power spectrum with a cumulative signal-to-noise
greater than 100 [58].



9.1.1 SZ Power Spectrum 

The temperature decrement along the line of sight due to the SZ effect can be
written as the integral of pressure along the same line of sight,

$$y \equiv \frac{\Delta T}{T_{\rm CMB}} = g(x) \int dr\, a(r)\, \frac{k_B \sigma_T}{m_e c^2}\, n_e(r)\, T_e(r) , \qquad (228)$$

where $\sigma_T$ is the Thomson cross-section, $k_B$ is the Boltzmann constant, $n_e$ is
the electron number density, $r$ is the comoving distance, and
$g(x) = x\coth(x/2) - 4$, with $x = h\nu/k_B T_{\rm CMB}$, is the spectral shape of the SZ
effect. In the Rayleigh-Jeans (RJ) part of the CMB spectrum, $g(x) = -2$. For
the rest of this paper, we assume observations in the Rayleigh-Jeans regime of
the spectrum, though an experiment such as Planck, with sensitivity beyond the
peak of the spectrum, can separate out these contributions based on the
spectral signature, $g(x)$ [58] (see also [277,115] for frequency separation of
the CMB from foregrounds).
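
As a quick numerical check of the spectral function, the following Python
sketch evaluates $g(x) = x\coth(x/2) - 4$ and locates the frequency at which it
changes sign. The CMB temperature of 2.725 K is an assumed value that is not
quoted in the text, and the function names are ours.

```python
import math

H_PLANCK = 6.62607015e-34  # Planck constant in J s
K_BOLTZ = 1.380649e-23     # Boltzmann constant in J / K
T_CMB = 2.725              # assumed CMB temperature in K


def x_of_nu(nu_ghz, t_cmb=T_CMB):
    """Dimensionless frequency x = h nu / (k_B T_CMB) for nu in GHz."""
    return H_PLANCK * nu_ghz * 1.0e9 / (K_BOLTZ * t_cmb)


def g_sz(x):
    """Spectral shape of the thermal SZ effect, g(x) = x coth(x/2) - 4 (x > 0)."""
    return x / math.tanh(0.5 * x) - 4.0


def sz_null_x(lo=1.0, hi=10.0, tol=1.0e-12):
    """Root of g(x) by bisection; g is negative at lo and positive at hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g_sz(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sz_null_ghz(t_cmb=T_CMB):
    """Frequency in GHz at which the thermal SZ signal vanishes."""
    return sz_null_x() * K_BOLTZ * t_cmb / H_PLANCK / 1.0e9


if __name__ == "__main__":
    print("g at 30 GHz :", g_sz(x_of_nu(30.0)))
    print("null at GHz :", sz_null_ghz())
```

At low frequencies the function approaches the Rayleigh-Jeans value of $-2$,
and its sign change near 217 GHz is the feature that multifrequency
experiments exploit to separate the SZ signal.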

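The SZ power spectrum, bispectrum and trispectrum are defined in the flat
sky approximation in the usual way:

$$\langle y(\mathbf{l}_1) y(\mathbf{l}_2) \rangle = (2\pi)^2 \delta_D(\mathbf{l}_{12})\, C_l^{\rm SZ} ,$$
$$\langle y(\mathbf{l}_1) y(\mathbf{l}_2) y(\mathbf{l}_3) \rangle_c = (2\pi)^2 \delta_D(\mathbf{l}_{123})\, B^{\rm SZ}(\mathbf{l}_1, \mathbf{l}_2, \mathbf{l}_3) ,$$
$$\langle y(\mathbf{l}_1) \ldots y(\mathbf{l}_4) \rangle_c = (2\pi)^2 \delta_D(\mathbf{l}_{1234})\, T^{\rm SZ}(\mathbf{l}_1, \mathbf{l}_2, \mathbf{l}_3, \mathbf{l}_4) . \qquad (229)$$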

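These can be written as a redshift projection of the pressure power spectrum,
bispectrum and trispectrum, respectively:

$$C_l^{\rm SZ} = \int dr\, \frac{W^{\rm SZ}(r)^2}{d_A^2}\, P_\Pi\left(\frac{l}{d_A}, r\right) , \qquad (230)$$

$$B^{\rm SZ} = \int dr\, \frac{W^{\rm SZ}(r)^3}{d_A^4}\, B_\Pi\left(\frac{\mathbf{l}_1}{d_A}, \frac{\mathbf{l}_2}{d_A}, \frac{\mathbf{l}_3}{d_A}; r\right) ,$$

$$T^{\rm SZ} = \int dr\, \frac{W^{\rm SZ}(r)^4}{d_A^6}\, T_\Pi\left(\frac{\mathbf{l}_1}{d_A}, \frac{\mathbf{l}_2}{d_A}, \frac{\mathbf{l}_3}{d_A}, \frac{\mathbf{l}_4}{d_A}; r\right) . \qquad (231)$$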

Here, $d_A$ is the angular diameter distance. In the RJ part of the frequency
spectrum, the SZ weight function is

$$W^{\rm SZ}(r) = -2\, \frac{k_B \sigma_T \bar{n}_e}{a(r)^2 m_e c^2} , \qquad (232)$$

where $\bar{n}_e$ is the mean electron density today. In deriving equation (231), we
have used the Limber approximation [168], by setting $k = l/d_A$, and the
flat-sky approximation. Here, we have written the correlations in terms of the
power spectrum, bispectrum and trispectrum of the large scale structure
pressure, denoted by $\Pi$.
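
The Limber projection of a three-dimensional power spectrum onto the sky can be
sketched in a few lines of Python. The weight function, distance relation and
power spectrum passed in below are toy inputs chosen so that the integral is
known analytically; they are assumptions made for illustration and are not the
SZ weight or the pressure power spectrum of this section. The function name
limber_cl is ours.

```python
def limber_cl(ell, r_grid, weight, d_a, power):
    """Limber integral C_l = int dr W(r)^2 / d_A(r)^2 * P(k = l / d_A(r), r),
    evaluated with the trapezoid rule on the supplied grid of distances."""
    values = []
    for r in r_grid:
        da = d_a(r)
        values.append(weight(r) ** 2 / da ** 2 * power(ell / da, r))
    total = 0.0
    for i in range(1, len(r_grid)):
        total += 0.5 * (values[i] + values[i - 1]) * (r_grid[i] - r_grid[i - 1])
    return total


if __name__ == "__main__":
    # Toy inputs: d_A = r, constant weight w0 = 2 and P(k) = 3 / k^2, for which
    # C_l = w0^2 * 3 * (r_max - r_min) / l^2.
    grid = [100.0 + i for i in range(901)]
    for ell in (10, 100, 1000):
        cl = limber_cl(ell, grid, lambda r: 2.0, lambda r: r,
                       lambda k, r: 3.0 / k ** 2)
        print(ell, cl)
```

In an actual calculation the toy callables would be replaced by the weight
function of equation (232) and a model for the pressure power spectrum, with
the grid covering the range of distances that contribute.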

The halo approach has been widely utilized to make analytical predictions for
the statistics of the thermal SZ effect from large scale structure, such as
the power spectrum (e.g., [44,162]). Other approaches include a biased
description of the pressure power spectrum with respect to the dark matter
density field (e.g., [218,58]). These analytical calculations are now fully
complemented by numerical simulations (e.g., [61,223,246,266]), which are now
beginning to test the assumptions related to the halo based calculations. So
far, comparisons between numerical simulations and the halo approach suggest
significant agreement, better than comparisons involving dark matter alone
[224]. We will discuss reasons for this below.

Fig. 50. The dark matter (NFW) profile and the gas profiles predicted by
hydrostatic equilibrium, as a function of the $b$ parameter (see equation 236),
with $r_s = 100$. The relative normalization between individual parameters is
set using a gas fraction value of 0.1, though the NFW profile is arbitrarily
normalized with $\rho_s = 1$; the gas profiles scale with the same factor. For
comparison, we also show a typical example of the so-called $\beta$ model,
$(1 + r^2/r_c^2)^{-3\beta/2}$, which is generally used as a fitting function for X-ray
and SZ observations of clusters. We refer the reader to [179] and [275] for a
detailed comparison of $\beta$ models and the NFW-gas profiles.

First, we will describe the halo based approach to SZ statistics by introducing
the clustering of large scale structure pressure. This is similar to the dark
matter power spectrum and its projection along the line of sight, which leads
to the weak lensing convergence power spectrum: the line of sight projection of
the large scale structure pressure leads to the SZ effect.



9.1.2 Clustering Properties of Large Scale Structure Pressure 

In order to describe the large scale structure pressure, we make use of the
hydrostatic equilibrium between the gas and the dark matter distributions
within halos. The hydrostatic assumption is supported by various observations
of galaxy clusters, where the existence of regularity relations, such as the
size-temperature relation [193], between physical properties of dark matter
and baryon distributions suggests simple physical relations between the two
properties.

The hydrostatic equilibrium for gas with pressure $P$ and density $\rho_g$,

$$\frac{1}{\rho_g} \frac{dP}{dr} = -\frac{G M_\delta(r)}{r^2} , \qquad (233)$$

can be simplified in the limit that the gas is ideal,
$P = \rho_g k_B T_e / (\mu m_p)$, and isothermal to obtain

$$\frac{k_B T_e}{\mu m_p} \frac{d\log\rho_g}{dr} = -\frac{G M_\delta(r)}{r^2} , \qquad (234)$$

where $\mu = 0.59$, corresponding to a hydrogen mass fraction of 76%. Here,
$M_\delta(r)$ is the dark matter mass only, out to a radius of $r$. Using a NFW
profile for the dark matter distribution, we can analytically calculate the
baryon density profile $\rho_g(r)$,

$$\rho_g(r) = \rho_{g0}\, e^{-b} \left(1 + \frac{r}{r_s}\right)^{b r_s/r} , \qquad (235)$$

where $b$ is a constant, for a given mass [179,275]:

$$b = \frac{4\pi G \mu m_p \rho_s r_s^2}{k_B T_e} . \qquad (236)$$

The normalization, $\rho_{g0}$, can be set to obtain a constant gas mass fraction
for halos comparable with the universal baryon to dark matter ratio:
$f_g = M_g/M_\delta = \Omega_b/\Omega_m$. The total gas mass present in a dark matter halo
within the virial radius, $r_v$, is

$$M_g(r_v) = 4\pi \rho_{g0}\, e^{-b}\, r_s^3 \int_0^c dx\, x^2 (1+x)^{b/x} . \qquad (237)$$
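
The isothermal gas profile of equation (235) and the gas mass integral of
equation (237) are simple to evaluate numerically. The Python sketch below
works in units of the scale radius, $x = r/r_s$, and uses illustrative function
names and parameter values that are ours. It also includes the right-hand side
of the hydrostatic equation for an NFW halo, written in terms of $b$, so that
the profile can be checked against the differential equation it is meant to
solve.

```python
import math


def gas_profile(x, b, rho_g0=1.0):
    """Isothermal gas density rho_g0 * exp(-b) * (1 + x)^(b / x), x = r / r_s > 0."""
    return rho_g0 * math.exp(-b + (b / x) * math.log1p(x))


def nfw_profile(x, rho_s=1.0):
    """NFW dark matter density rho_s / (x (1 + x)^2), shown for comparison."""
    return rho_s / (x * (1.0 + x) ** 2)


def hydrostatic_slope(x, b):
    """d ln(rho_g) / dx implied by hydrostatic equilibrium in an NFW halo:
    -b [ln(1 + x) - x / (1 + x)] / x^2."""
    return -b * (math.log1p(x) - x / (1.0 + x)) / x ** 2


def gas_mass(c, b, r_s=1.0, rho_g0=1.0, steps=4000):
    """Gas mass within the virial radius r_v = c r_s (midpoint rule in x)."""
    h = c / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x * x * gas_profile(x, b, rho_g0)
    return 4.0 * math.pi * r_s ** 3 * total * h


if __name__ == "__main__":
    for b in (2.0, 5.0, 10.0):
        print(b, gas_profile(1.0, b), gas_mass(5.0, b))
```

Since $(1+x)^{b/x} \to e^b$ as $x \to 0$, the central gas density equals
$\rho_{g0}$, which is why the factor $e^{-b}$ appears in the normalization.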



The electron temperature can be calculated based on the virial theorem or
similar arguments as discussed in [50]. Using the virial theorem, we can write

$$k_B T_e = \frac{\gamma G \mu m_p M_\delta}{3 r_v} , \qquad (238)$$

with $\gamma = 3/2$. Since $r_v \propto M_\delta^{1/3} (1+z)^{-1}$ in physical coordinates,
$T_e \propto M_\delta^{2/3} (1+z)$. The average density weighted temperature is

$$\langle T_e \rangle_\delta = \int dM\, \frac{M}{\rho_b} \frac{dn}{dM}(M,z)\, T_e(M,z) . \qquad (239)$$

Fig. 51. The variation in the density weighted temperature of electrons as a
function of redshift. The solid line shows the redshift evolution of the
temperature in hydrodynamical simulations while the halo models, with varying
halo masses, are shown in dotted and dashed lines. The figure is from [224].

In figure 51, we show the evolution of the density weighted temperature from
[224]. The results from numerical simulations are well reproduced with a
Press-Schechter mass distribution for halos. For the $\Lambda$CDM cosmology, the halo
model predicts a density weighted temperature for large scale structure
electrons of $\sim 0.5$ keV today; if only halos out to a mass of
$8 \times 10^{14}\, M_\odot$ are included, this mean density weighted temperature
decreases to 0.41 keV.

In figure 50, we show the NFW profile for the dark matter and arbitrarily
normalized gas profiles predicted by hydrostatic equilibrium and the virial
theorem for several values of $b$. As $b$ is decreased, such that the
temperature is increased, the turn over radius of the gas distribution shifts
to larger radii. As an example, we also show the so-called $\beta$ model that is
commonly used to describe X-ray and SZ observations of galaxy clusters and to
derive the Hubble constant from combined SZ/X-ray data. The $\beta$ model
describes the underlying gas distribution predicted by the gas profile used
here in equilibrium with the NFW profile, though we find differences,
especially at the outermost radii of halos. This difference can be used as a
way to establish the hydrostatic equilibrium of clusters, though any difference
in the gas distribution at the outer radii should be considered in the context
of possible substructure and mergers.

A discussion of the comparison between the gas profile used here and the
$\beta$ model is available in [179] and [275]. In addition, we refer the reader to
[50] for a detailed discussion of issues related to modeling the pressure power
spectrum using halos and the associated systematic errors. Comparisons of the
halo model predictions with numerical simulations are available in [246] and
[224].

As discussed in [161], one can make several improvements to the above gas
profile. One can constrain the gas distribution such that, at the outermost
radii of halos, the gas distribution follows that of the dark matter. This can
be done by setting the slopes of the dark matter and gas profiles to be the
same beyond some radius. If gas is assumed to be in hydrostatic equilibrium, a
gas profile that traces dark matter produces a temperature profile that varies
with redshift. In general, one can obtain consistent solutions by assuming a
polytropic form for the pressure, $P \propto \rho_g T_e \propto \rho_g^\gamma$. As discussed in [161],
predictions based on this prescription for cluster gas are more consistent with
observations than the simple description involving an isothermal electron
distribution.

Given a description of the halo electron (or gas) profile and their temperature
distribution, we can write the power spectrum of large scale structure pressure
as

$$P_\Pi(k) = P^{1h}(k) + P^{2h}(k) , \qquad (240)$$
$$P^{1h}(k) = M^\Pi_{02}(k,k) , \qquad (241)$$
$$P^{2h}(k) = \left[ M^\Pi_{11}(k) \right]^2 P^{\rm lin}(k) , \qquad (242)$$

where the two terms represent contributions from two points in a single halo
(1h) and points in two different halos (2h), respectively.

Here, we redefine the integral in equation (98) for dark matter to account for
pressure as

$$M^\Pi_{ij}(k_1, \ldots, k_j; z) \equiv \int dm\, n(m,z) \left(\frac{m}{\bar\rho}\right)^j b_i(m)\, [T_e(m,z)]^j
\times [u_\Pi(k_1|m;z) \ldots u_\Pi(k_j|m;z)] , \qquad (243)$$

with the three-dimensional Fourier transform of the gas profile substituted
in equation (80) to obtain $u_\Pi(k|m;z)$. We define the bias and correlation of
pressure, relative to dark matter, as

$$b_\Pi(k) = \sqrt{\frac{P_\Pi(k)}{P_\delta(k)}} \qquad (244)$$

and

$$r_{\Pi\delta}(k) = \frac{P_{\Pi\delta}(k)}{\sqrt{P_\Pi(k) P_\delta(k)}} , \qquad (245)$$

respectively. Here, $P_\delta$ is the dark matter power spectrum and $P_{\Pi\delta}$ is the
pressure-dark matter cross power spectrum. As presented for dark matter,
we can similarly extend the derivation to calculate the pressure bispectrum and
trispectrum.

Fig. 52. The (a) pressure and (b) pressure-dark matter cross power spectrum
today, broken into individual contributions under the halo description. For
comparison, we also show the dark matter power spectrum under the halo model
and, in (a), the pressure bias and, in (b), the pressure-dark matter
correlation.

Fig. 53. The baryon density (left) and temperature weighted density, or
pressure (right), in a time-slice of a hydrodynamical simulation by [246]. As
shown, most of the contribution to large scale structure pressure comes from
massive halos, while the baryon density is distributed over a wide range of
mass scales and traces the filamentary structures defined by the dark matter
distribution. The figure is from U. Seljak based on simulations by [246].

In figure 52(a), we show the logarithmic power spectrum of pressure and dark
matter, defined such that $\Delta^2(k) = k^3 P(k)/2\pi^2$, with contributions broken down
into the 1h and 2h terms today. As shown, the pressure power spectrum shows an
increase in power relative to the dark matter at scales out to a few
$h$ Mpc$^{-1}$, and a decrease thereafter.

The decrease in power at small scales can be understood through the relative
contribution to pressure as a function of the halo mass. In figure 54, we
break the total dark matter power spectrum (a) and the total pressure power
spectrum (b) down as a function of mass. As shown, contributions to both dark
matter and pressure come from massive halos at large scales and from small
mass halos at small scales. For the pressure power spectrum, the temperature
weighting, with its $T_e \propto M^{2/3}$ dependence, suppresses the contribution from
low mass halos relative to that from the high mass end.
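
The effect of the temperature weighting on the mass dependence can be
illustrated with a toy calculation in Python. The Schechter-like mass function
below, with a characteristic mass of $10^{14}\, M_\odot$, is a hypothetical
stand-in and not the mass function used for figure 54; the function names and
integration limits are likewise ours. On large scales, where the halo profile
factors tend to unity, the single-halo weight per unit mass scales as
$n(M) M^2$ for the density and as $n(M) M^2 T_e^2 \propto n(M) M^{10/3}$ for
the pressure.

```python
import math


def toy_mass_function(m, m_star=1.0e14):
    """Hypothetical Schechter-like dn/dM proportional to M^-2 exp(-M / M*),
    with arbitrary normalization."""
    return (m / m_star) ** -2 * math.exp(-m / m_star) / m_star


def one_halo_weight(m, temperature_weighted, m_star=1.0e14):
    """Large scale single-halo integrand: n(M) M^2, times T_e^2 with
    T_e proportional to M^(2/3) when temperature_weighted is True."""
    weight = toy_mass_function(m, m_star) * m ** 2
    if temperature_weighted:
        weight *= (m / m_star) ** (4.0 / 3.0)
    return weight


def fraction_above(m_cut, temperature_weighted,
                   m_min=1.0e10, m_max=1.0e16, steps=6000):
    """Fraction of the integral over mass contributed by halos with M > m_cut,
    integrated in ln(M) with the midpoint rule."""
    dlnm = math.log(m_max / m_min) / steps
    total = 0.0
    above = 0.0
    for i in range(steps):
        m = m_min * math.exp((i + 0.5) * dlnm)
        contribution = one_halo_weight(m, temperature_weighted) * m * dlnm
        total += contribution
        if m > m_cut:
            above += contribution
    return above / total


if __name__ == "__main__":
    print("density :", fraction_above(1.0e14, False))
    print("pressure:", fraction_above(1.0e14, True))
```

With this toy mass function, the fraction of the large scale single-halo term
contributed by halos above the characteristic mass is substantially larger for
the pressure than for the density.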

In figure 53, we show two images of a time slice through numerical simulations
by [246]. The gas, or baryon, density distribution is highly filamentary and
traces the large scale dark matter distribution. The pressure, however, is
confined to virialized halos at the intersections between filaments. These are
the massive clusters in the simulation box: the density weighted temperature,
or pressure, of large scale structure is clearly dominated by massive clusters.
Thus, the pressure power spectrum, at all scales of interest, can be easily
described with halos of mass greater than $10^{14}\, M_\odot$. A comparison of
the dark matter and pressure power spectra, as a function of mass, in figure 54
reveals that the turn over in the pressure power spectrum corresponds to an
effective scale radius for halos with mass greater than $10^{14}\, M_\odot$. We
refer the reader to [50] for further details on the pressure power spectrum and
its properties.

Fig. 54. The mass dependence of the dark matter power spectrum (a) and pressure
power spectrum (b). Here, we show the total contribution broken into mass
limits as written on the figure. As shown in (a), the large scale contribution
to the dark matter power comes from massive halos while small mass halos
contribute at small scales. For the pressure, in (b), only massive halos above
a mass of $10^{14}\, M_\odot$ contribute to the power.

We can now use the pressure power spectrum to calculate the SZ angular power
spectrum by projecting it along the line of sight following equation (231).
In figure 55(a), we show the SZ power spectrum due to baryons present in
virialized halos. As shown, most of the contribution to the SZ power spectrum
comes from individual massive halos, while the halo-halo correlations only
contribute at a level of 10% at large angular scales. This is contrary to, say,
the lensing convergence power spectrum, where most of the power at large
angular scales is due to halo-halo correlations. The difference is effectively
due to the dependence of pressure on the most massive halos in the large scale
structure and, to a lesser but somewhat related extent, to the fact that the SZ
weight function increases towards low redshifts. Note that the lensing weight
function selectively probes the large scale dark matter density power spectrum
at comoving distances half that of the background sources ($z \sim 0.2$ to 0.5
when sources are at a redshift of 1), but has no extra dependence on mass when
compared to the SZ weight function.

The predictions based on the halo model are consistent with numerical
simulations. In figure 56, we show the angular power spectrum of the SZ effect
as measured in numerical simulations by [224] and a comparison to the halo
calculation following [50]. Note that the simulations show a slight decrease in
signal relative to the halo calculation when the maximum mass included in the
calculation is $10^{16}\, h^{-1} M_\odot$. The measurements are best described
with a halo mass distribution out to a maximum mass of
$8 \times 10^{14}\, h^{-1} M_\odot$, consistent with the expectation that the
highest mass halos are rare and are not present in the simulated box.

As we discuss later, the kinetic SZ effect has no such dependence on the
massive halos, and contributions to the kinetic SZ effect come from masses over
a wide range. In figure 57, we show projected maps of the SZ thermal and SZ
kinetic effects produced in simulations by [266]. The maps clearly show that
the SZ thermal effect may be a useful way to map the massive structures in
the universe.

The fact that the SZ power spectrum results mainly from the single halo term
also leads to a sharp reduction of power when the maximum mass used in
the calculation is varied. For example, as discussed in [50] and illustrated in
figure 55(b), with the maximum mass decreased from $10^{16}$ to
$10^{13}\, M_\odot$, the SZ power spectrum is reduced by nearly two orders of
magnitude at large scales and by an order of magnitude at $l \sim 10^4$. The
same dependence also suggests a significant sample variance for the SZ effect,
as massive halos are rare; as discussed in [51], the SZ statistics from small
fields are likely to be heavily biased by the mass distribution of halos. The
same effect was found in numerical simulations, where the power spectrum was
observed to vary by a factor of $\sim 2$ from one 4 deg$^2$ field to another
over all scales probed [246]. For similar reasons, there is also a significant
non-Gaussian contribution to the covariance of the SZ effect that may
complicate the use of the SZ power spectrum as a probe of cosmology or galaxy
cluster physics [51].

Following [296], one can calculate the number counts of SZ halos under the
approximation that gas traces dark matter and that the temperature of
electrons can be related to the velocity dispersion of the halo through virial
arguments. This allows one to simplify the expected temperature decrement due
to the SZ effect at RJ wavelengths



$$\frac{\Delta T}{T_{\rm CMB}} \propto -\sigma_{\rm dm}^2 \int dl\, \rho_{\rm dm}(r) , \qquad (246)$$

where the temperature of electrons has been approximated via the line of sight
velocity dispersion of dark matter particles, $\sigma_{\rm dm}^2$.

The expected number of peaks due to the thermal SZ effect can be evaluated 
by determining the expected SZ flux, integrated over the cluster, as a function 
of mass and then integrating over the mass function: 



where the probability distribution of temperature fluctuations arises from the 
lognormal scatter in the concentration-mass relation [296]. Figure 58 shows 
that the counts predicted by this model are in good agreement with numerical 
simulations. Note, however, that the simulations were of dark matter only, 
so they also assumed that gas traces density. Hydrodynamical simulations 
have been used to test the extent to which gas traces dark matter; they show 
that gas pressure effects can be important at the low mass end. Therefore, 
one expects modifications to Figure 58 at the low mass end; counts based on 
hydrodynamical simulations can be found in e.g., [266,61]. 

9.2 The kinetic SZ effect 

Extending our calculation on the contribution of large scale structure gas 
distribution to CMB anisotropies through SZ effect, we can also study an 
associated effect involving baryons associated with halos in the large scale 
structure. 

The bulk flow of electrons that scatter CMB photons leads to temperature
fluctuations through the well known Doppler effect



where v is the baryon velocity. In figure 55, we show the general Doppler effect
due to the velocity field. The power spectrum is such that it peaks around the
horizon at the scattering event projected on the sky today. On scales smaller
than the horizon at scattering, the contributions are significantly canceled as
photons scatter against the crests and troughs of the perturbation. As a result,
the Doppler effect is moderately sensitive to how rapidly the universe
reionizes, since contributions from a sharp surface of reionization do not
cancel [54]. Also important are double scattering events, in which photons
first scatter out of the line of sight and then scatter back in, that do not
necessarily cancel [144,54].

The cancellations can be avoided by modulating the velocity field with electron
number density fluctuations. This is the so-called Ostriker-Vishniac (OV)
effect [206,285]. The OV effect has been described as the contribution to
temperature anisotropies due to the baryon modulated Doppler effect in the
linear regime of fluctuations. At non-linear scales, it is well known that the
peculiar velocity of galaxy clusters along the line of sight also leads to a
contribution to temperature anisotropies. This effect is commonly known as the
kinetic Sunyaev-Zel'dovich effect and arises from the halo modulation of the
Doppler effect associated with the velocity field [274]. The kinetic SZ effect
can be considered as the OV effect extended to the non-linear regime of baryon
fluctuations [119]; however, it should be understood that the basic physical
mechanism responsible for the two effects is the same and that there is no
reason to describe them as separate contributions.



9.2.1 Kinetic SZ power spectrum 

The kinetic SZ temperature fluctuations, denoted as kSZ, can be written as a 
product of the line of sight velocity, under linear theory, and density fluctua- 
tions 
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Here, we have used linear theory to write the large scale velocity field in 
terms of the linear dark matter density field. The multiplication between the 
velocity and density fields in real space has been converted to a convolution 
between the two fields in Fourier space. We can now expand the temperature 
perturbation due to the kinetic SZ effect, T kSZ , using spherical harmonics: 

where we have symmetrized by using ${\bf k}_1$ and ${\bf k}_2$ to represent ${\bf k} - {\bf k}'$ and ${\bf k}'$,
respectively. Using

    $\hat{\bf n} \cdot {\bf k} = \frac{4\pi}{3}\, k \sum_{m'} Y_1^{m'}(\hat{\bf n})\, Y_1^{m'*}(\hat{\bf k})$ , (251)

and the Rayleigh expansion (equation 176), we can further simplify and rewrite
the multipole moments as

We can construct the angular power spectrum by considering $\langle a^*_{l_1 m_1} a_{l_2 m_2} \rangle$.
Under the assumption that the temperature field is statistically isotropic, the
correlation is independent of $m$, and we can write the angular power spectrum
as

We can separate out the contributions such that the total is made of correlations
following $\langle v_g v_g \rangle \langle \delta_g \delta_g \rangle$ and $\langle v_g \delta_g \rangle \langle v_g \delta_g \rangle$, depending on whether we consider
cumulants by combining ${\bf k}_1$ with ${\bf k}_1'$ or ${\bf k}_2'$, respectively. After some straightforward
but tedious algebra, and noting that

we can write

Here, the first term represents the contribution from $\langle v_g v_g \rangle \langle \delta_g \delta_g \rangle$, while the
second term is the $\langle v_g \delta_g \rangle \langle v_g \delta_g \rangle$ contribution. In simplifying the
integrals involving spherical harmonics, we have made use of the properties
of Clebsch-Gordan coefficients, in particular those involving $l = 1$. The integral
involves two distances and two Fourier modes and is summed over the
Wigner-3j symbol to obtain the power spectrum. Since we are primarily interested
in the contribution at small angular scales here, we can ignore the
contribution to the kSZ effect involving the correlation between the linear density
field and baryons, and only consider the contribution that results from
baryon-baryon and density-density correlations. In fact, under the halo description
provided here, there is no correlation between the baryon field within halos
and the velocity field traced by individual halos (see § 7). Thus, the contribution
to the baryon-velocity correlation only comes from the 2-halo term of the density
field-baryon correlation. This correlation is suppressed at small scales and
is not a significant contributor to the kinetic SZ power spectrum [119].

Similar to the Limber approximation [168], in order to simplify the calculation
associated with $\langle v_g v_g \rangle \langle \delta_g \delta_g \rangle$, we use an equation involving completeness of
spherical Bessel functions (equation 181) and apply it to the integral over $k_2$
to obtain

The alternative approach, which has been the method of calculation in many
of the previous papers [285,72,130,68,119], is to use the flat-sky approximation,
with the kinetic SZ power spectrum written as



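    $C_l^{\rm kSZ} = \frac{1}{8\pi^2} \int dr\, \frac{(g \dot{G} G)^2}{d_A^2}\, P_{\delta\delta}(k)^2\, I_v\left(k = \frac{l}{d_A}\right)$ , (258)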

with the mode-coupling integral given by 
(259)

We refer the reader to [285] and [68] for details on this derivation. In the above, $\mu = \hat{\bf k} \cdot \hat{\bf k}_1$,
$y_1 = k_1/k$ and $y_2 = k_2/k = \sqrt{1 - 2\mu y_1 + y_1^2}$. This flat-sky approximation
makes use of the Limber approximation [168] to further simplify the calculation
with the replacement $k = l/d_A$. The power spectra here represent the
baryon field power spectrum and the velocity field power spectrum; the former is
assumed to trace the dark matter density field, while the latter is generally
related to the linear dark matter density field through the use of linear theory
arguments.
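
The relation between $y_2$, $y_1$ and $\mu$ quoted above is the law of cosines for the triangle formed by ${\bf k}$, ${\bf k}_1$ and ${\bf k}_2 = {\bf k} - {\bf k}_1$. As a short worked step, taking $\mu$ to be the cosine of the angle between ${\bf k}$ and ${\bf k}_1$ (an assumption about the convention in use):

```latex
\begin{align}
k_2^2 &= |{\bf k} - {\bf k}_1|^2 = k^2 - 2 k k_1 \mu + k_1^2 , \\
y_2 &\equiv \frac{k_2}{k} = \sqrt{1 - 2 \mu y_1 + y_1^2} , \qquad y_1 \equiv \frac{k_1}{k} .
\end{align}
```

In particular, $y_2 \to 1$ when $y_1 \ll 1$, for any value of $\mu$.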

The correspondence between the flat-sky and all-sky formulations can be obtained
by noting that, in the small scale limit, contributions to the flat-sky
effect come when $k_2 = |{\bf k} - {\bf k}_1| \sim k$ such that $y_1 \ll 1$. In this limit, the flat-sky
Ostriker-Vishniac effect reduces to a simple form given by [119]

    $C_l^{\rm kSZ} = \frac{1}{3} \int dr\, \frac{(g \dot{G} G)^2}{d_A^2}\, P_{gg}(k)\, v_{\rm rms}^2$ . (260)

Here, $v_{\rm rms}^2$ is the mean square of the uniform bulk velocity from large scales,

    $v_{\rm rms}^2 = \int dk\, \frac{P_{\delta\delta}(k)}{2\pi^2}$ . (261)

The factor of 1/3 arises from the fact that the mean square of each velocity
component is one third of that of the total velocity. Similarly, one can reduce
the all-sky expression, equation (257), to
that of the flat-sky, equation (260), in the small scale limit of $l \sim l_1 \gg l_2$,
with $l_1$ probing the density field [51].
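
The velocity integral entering the small-scale limit is a one-dimensional quadrature over the power spectrum, $v_{\rm rms}^2 = \int dk\, P(k)/2\pi^2$. A minimal sketch follows, assuming a toy spectrum $P(k) = A k e^{-k/k_0}$ chosen only because its integral, $A k_0^2/2\pi^2$, is known in closed form. It is not a cosmological model, and the function names are hypothetical.

```python
import math

def toy_power(k, amp=1.0, k0=0.1):
    """Toy spectrum P(k) = amp * k * exp(-k / k0); an illustrative assumption."""
    return amp * k * math.exp(-k / k0)

def v_rms_squared(power, k_min=0.0, k_max=5.0, n_steps=20000):
    """Trapezoidal estimate of v_rms^2 = int dk P(k) / (2 pi^2)."""
    dk = (k_max - k_min) / n_steps
    total = 0.5 * (power(k_min) + power(k_max))
    for i in range(1, n_steps):
        total += power(k_min + i * dk)
    return total * dk / (2.0 * math.pi ** 2)

def one_component(v2_total):
    """Isotropy: each Cartesian component carries one third of v_rms^2."""
    return v2_total / 3.0

numeric = v_rms_squared(toy_power)
analytic = 1.0 * 0.1 ** 2 / (2.0 * math.pi ** 2)  # amp * k0^2 / (2 pi^2)
print(numeric, analytic, one_component(numeric))
```

In a real calculation `toy_power` would be replaced by the linear dark matter power spectrum, and the growth factors would be carried by the prefactor of the line-of-sight integral.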

In figure 55, we show our prediction for the SZ kinetic effect and a comparison
with the SZ thermal contribution. As shown, the SZ kinetic contribution
is roughly an order of magnitude smaller than the thermal SZ contribution.
There is also a more fundamental difference between the two: the SZ thermal
effect, due to its dependence on the highest temperature electrons, depends more
on the most massive halos in the universe, while the SZ kinetic effect
arises mainly from large scale correlations of the halos that make up the
large scale structure.

The difference between the two effects arises from the fact that the kinetic SZ
effect is mainly due to the baryons and not the temperature-weighted baryons
that trace the pressure responsible for the thermal effect. Contributions to the
SZ kinetic effect come from baryons tracing all scales, down to small mass
halos. The difference in mass dependence between the two effects
suggests that a wide-field SZ thermal effect map and a wide-field SZ kinetic
effect map will differ from each other, in that massive halos, or clusters,
will be clearly visible in an SZ thermal map while the large scale structure
will be more evident in an SZ kinetic effect map. As shown with the thermal
and kinetic SZ maps in figure 57 from [266], numerical simulations are in fact
consistent with this picture (see also [62]).

As shown in figure 55(b), variations in the maximum mass used in the calculation
do not lead to orders of magnitude changes in the total kinetic SZ
contribution; the changes are considerably smaller than those in the total
thermal SZ contribution as a function of maximum mass. This again is consistent
with our basic result that most contributions come from the large scale linear
velocity modulated by baryons in halos. Consequently, while the thermal SZ
effect is dominated by shot-noise contributions, and is heavily affected by
sample variance, the same is not true for the kinetic SZ effect.

In figure 59, we show several additional predictions for the kinetic SZ effect,
following the discussion in [119]. Due to the density weighting, the kinetic
SZ effect peaks at small scales: arcminutes for $\Lambda$CDM. For a fully ionized
universe, contributions are moderately dependent on the optical depth $\tau$. Here,
we assume an optical depth to reionization of 0.05, consistent with current upper
limits on the reionization redshift from CMB [95] and other observational data
(see, e.g., [100] and references therein). In figure 59, we have calculated the
kinetic SZ power spectrum under several assumptions, including the cases where
gas is assumed to trace the non-linear density field and the linear density field.
We compare predictions based on such assumptions to those calculated using
the halo model. As shown, the halo model calculation shows slightly less power
than when using the non-linear dark matter density field to describe the clustering
of baryons. This difference arises from the fact that baryons do not fully trace
the dark matter in halos. Since the differences are small, one can safely use the
non-linear dark matter power spectrum to describe baryons. Using
linear theory only, however, leads to an underestimate of power by a factor of 3 to 4
at scales corresponding to multipoles of $l \sim 10^4$ to $10^5$ and may not provide
an accurate description of the total kinetic SZ effect.

In addition to the contribution due to the line of sight motion of halos, there
is an additional effect resulting from halo rotations, as discussed by [53]. Here,
the resulting rotational contribution to the kinetic SZ effect was evaluated under
the assumption that baryons in halos corotate with the dark matter; this
assumption is made primarily because of the lack of knowledge of the angular
momentum of gas in virialized halos from numerical simulations. In terms of the dark
matter, recent high resolution numerical simulations show that the spatial
distribution of angular momentum in dark matter halos has a universal profile
(see e.g. [31,286]). This profile is consistent with that of solid body rotation,
but saturates at large values of angular momentum. The spatial distribution
of angular momentum in most halos (80%) tends to be cylindrical and well
aligned with the spin of the halo. Also, angular momentum is almost independent
of the mass of the halo and does not evolve with redshift except after major
mergers.

For an individual cluster at a redshift $z$ with an angular diameter distance
$d_A$, one can write the temperature fluctuation as an integral of the electron
density, $n_e(r)$, weighted by the rotational velocity component, $\omega r \cos\alpha$, along
the line of sight. Introducing the fact that the line of sight velocity due to rotation is
proportional to the sine of the inclination angle of the rotational axis with respect
to the observer, $i$, we write



Here, $\theta$ is the line of sight angle relative to the cluster center and $\phi$ is an
azimuthal angle measured relative to an axis perpendicular to the spin axis
in the plane of the sky. In simplifying, we have introduced the fact that the
angle between the rotational velocity and the line of sight, $\alpha$, is such that $\alpha =
\cos^{-1}(d_A \theta / r)$. In equation (263), $R_{\rm vir}$ is the cluster virial radius.
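
The geometry described above can be sketched as a worked step. This is a reconstruction from the stated ingredients (the electron density weighted by $\omega r \cos\alpha$, a $\sin i$ projection, and the relation $\cos\alpha = d_A\theta/r$), and should not be read as the source's own equation. The Thomson cross-section $\sigma_T$, velocities expressed in units of the speed of light, the overall sign, and the neglect of any optical-depth damping factor are all assumptions.

```latex
\begin{align}
\frac{\Delta T}{T}(\theta, \phi) &\simeq \sigma_T \, \eta(\theta) \, \cos\phi \, \sin i , \\
\eta(\theta) &= \int_{d_A \theta}^{R_{\rm vir}} \frac{2 r \, dr}{\sqrt{r^2 - d_A^2 \theta^2}} \; n_e(r) \, \omega \, d_A \theta .
\end{align}
```

Here the path length along the line of sight is $\ell = \sqrt{r^2 - d_A^2\theta^2}$, so that $d\ell = r\,dr/\sqrt{r^2 - d_A^2\theta^2}$; the factor of 2 counts the near and far halves of the cluster, and $\omega r \cos\alpha = \omega d_A \theta$.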

To describe the halo rotations, we write the dimensionless spin parameter
$\lambda (= J\sqrt{|E|}/GM^{5/2})$ following [31] as

    $\lambda = \frac{J \sqrt{c\, g(c)}}{2 V_c M_{\rm vir} R_{\rm vir} f(c)}$ , (264)

where the virial concentration for the NFW profile is $c = R_{\rm vir}/r_s$, $J$ is the
total angular momentum, and $V_c^2 = GM_{\rm vir}/R_{\rm vir}$. In Ref. [31], the probability
distribution function for $\lambda$ was measured through numerical simulations and
was found to be well described by a log-normal distribution with a mean, $\bar\lambda$,
of 0.042 ± 0.006 and a width, $\sigma_\lambda$, of 0.50 ± 0.04.
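
As an illustration of the spin distribution quoted above, the following sketch evaluates a log-normal density in $\lambda$. It assumes the common parametrization $p(\lambda)\,d\lambda = (2\pi\sigma_\lambda^2)^{-1/2} \exp[-\ln^2(\lambda/\bar\lambda)/2\sigma_\lambda^2]\,d\lambda/\lambda$, in which $\bar\lambda$ is the median of the distribution. Whether this matches the exact convention of [31] is an assumption, and the function names are hypothetical.

```python
import math

LAMBDA_BAR = 0.042   # central value quoted in the text
SIGMA_LAMBDA = 0.50  # log-normal width quoted in the text

def spin_pdf(lam, lam_bar=LAMBDA_BAR, sigma=SIGMA_LAMBDA):
    """Log-normal density p(lambda); lam_bar is the median in this convention."""
    if lam <= 0.0:
        return 0.0
    x = math.log(lam / lam_bar)
    norm = math.sqrt(2.0 * math.pi) * sigma * lam
    return math.exp(-0.5 * (x / sigma) ** 2) / norm

def spin_cdf(lam, lam_bar=LAMBDA_BAR, sigma=SIGMA_LAMBDA):
    """Cumulative probability P(< lambda), via the error function."""
    x = math.log(lam / lam_bar) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(x))

def mean_spin(lam_bar=LAMBDA_BAR, sigma=SIGMA_LAMBDA):
    """Arithmetic mean of a log-normal: lam_bar * exp(sigma^2 / 2)."""
    return lam_bar * math.exp(0.5 * sigma ** 2)

print(spin_cdf(LAMBDA_BAR))  # half of all halos lie below lam_bar
print(mean_spin())
```

In this convention the arithmetic mean lies slightly above $\bar\lambda$, so it matters which of the two a quoted "mean" refers to when drawing spins for individual halos.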

To relate the angular velocity, $\omega$, to the spin, we first integrate the NFW profile over
a cluster to calculate $J$, and substitute in the above to find

    $\omega = \frac{3 \lambda V_c\, c^2 f^2(c)}{R_{\rm vir}\, h(c) \sqrt{c\, g(c)}}$ . (265)

The functions $f(c)$, $g(c)$ and $h(c)$, in terms of the concentration, follow as

In figure 60, we show the temperature fluctuation produced by the rotational
component for a typical cluster with mass $5 \times 10^{14}\,M_\odot$ at a redshift of 0.5.
The maximal effect, with the mean spin parameter measured by [31], is of
order $\sim 2.5\,\mu$K. The sharp drop towards the center of the cluster is due to
the decrease in the rotational velocity. As shown, the effect leads to a distinct
temperature distribution with a dipole-like pattern across clusters. Here, we
have taken the cluster rotational axis to be perpendicular to the line
of sight; clearly, when the axis is aligned along the line of sight, there is
no resulting contribution to the SZ kinetic effect through scattering.

The order of magnitude of this rotational contribution can be understood by
estimating the rotational velocity where the effect peaks. In equation (265), the
angular velocity is $\omega \sim 3\lambda V_c/R_{\rm vir}$, with the functions depending on the
concentration being of order a few ($\approx 2.4$ when $c = 5$). Since the circular velocity
for a typical cluster is of order $\sim 1500$ km s$^{-1}$, with $R_{\rm vir} \sim$ Mpc and $\lambda \sim 0.04$,
at typical inner radii of order $\sim R_{\rm vir}/5$, we find velocities of order $\sim 30$ km
s$^{-1}$. Since, on average, peculiar velocities for clusters are of order $\sim 250$ km
s$^{-1}$, the rotational velocity is lower by a factor of $\sim 8$ when compared with
the peculiar velocity of the typical cluster. Furthermore, since the kinetic SZ
due to peculiar motion peaks in the center of the halo where the density is
highest, while the rotational effect peaks away from the center, the difference
between maximal peculiar kinetic SZ and rotational kinetic SZ temperature
fluctuations is even greater. Note, however, that each individual cluster has a
different orientation and magnitude of peculiar velocity and rotation, so the
velocity-to-rotation ratio could vary considerably. In favorable cases where the peculiar
velocity is aligned mostly across the line of sight, the rotational contribution
may be important.
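
The concentration-dependent factor quoted above, which multiplies $3\lambda V_c/R_{\rm vir}$, can be checked numerically. The sketch below assumes explicit forms for $f(c)$, $g(c)$ and $h(c)$: the NFW enclosed-mass factor, a binding-energy factor, and the integral $\int_0^c x^3/(1+x)^2\,dx$ that arises when $J$ is computed for solid-body rotation. These forms are standard NFW integrals and are stated here as assumptions rather than taken from the source; the function names are hypothetical.

```python
import math

def f_nfw(c):
    """NFW mass factor: M(<R_vir) is proportional to ln(1+c) - c/(1+c)."""
    return math.log(1.0 + c) - c / (1.0 + c)

def g_nfw(c):
    """Binding-energy factor (assumed form, see the lead-in)."""
    return 1.0 - 2.0 * math.log(1.0 + c) / (1.0 + c) - 1.0 / (1.0 + c) ** 2

def h_nfw(c):
    """Closed form of the integral of x^3/(1+x)^2 from 0 to c."""
    return 3.0 * math.log(1.0 + c) + c * (c * c - 3.0 * c - 6.0) / (2.0 * (1.0 + c))

def concentration_factor(c):
    """Dimensionless factor c^2 f^2 / (h sqrt(c g))."""
    return c ** 2 * f_nfw(c) ** 2 / (h_nfw(c) * math.sqrt(c * g_nfw(c)))

def h_numeric(c, n_steps=100000):
    """Midpoint-rule check of h(c) against its closed form."""
    dx = c / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * dx
        total += x ** 3 / (1.0 + x) ** 2
    return total * dx

print(concentration_factor(5.0))
print(h_nfw(5.0), h_numeric(5.0))
```

With these assumed forms the factor evaluates to about 2.4 at $c = 5$, in line with the value quoted in the text, and it varies only slowly with concentration.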

In figure 61, we show the kinetic SZ effect towards the same cluster due to the
peculiar motion and the contribution resulting from the lensed CMB towards
the same cluster. The latter contribution is sensitive to the gradient of the
dark matter potential of the cluster along the large scale CMB gradient. In
this illustration, we have taken the CMB gradient to be the rms value of
13 $\mu$K arcmin$^{-1}$ following [247]. Previously, it was suggested that the lensed
CMB contribution can be extracted based on its dipole-like signature. Given
the fact that the rotational contribution also leads to a similar pattern, a
temperature distribution with a dipole pattern across a cluster cannot easily
be ascribed to the lensing effect. However, as evident from figures 60 and 61,
the dipole signature associated with the rotational scattering is limited to the
inner region of the cluster while the lensing effect, due to its dependence on
the gradient of the dark halo potential, covers a much larger extent. Also, the
two dipoles need not lie in the same direction, as the background gradient of
the primary CMB fluctuations and the rotational axis of halos may be aligned
differently. Thus, to separate the lensed effect and the rotational contribution
from each other and from the dominant kinetic SZ, one can consider various
filtering schemes (see the discussion in [247]). In figure 61, we have not included
the dominant thermal SZ contribution since it can be separated from other
contributions reliably if multifrequency data are available.

The interesting experimental possibility here is whether one can obtain a wide-field
map of the SZ kinetic effect. Since it is now well known that the unique
spectral dependence of the thermal SZ effect can be used to separate its
contribution [58], it is likely that after such a separation, the SZ kinetic effect will
be the dominant signal at small angular scales. To separate the SZ thermal
effect, multifrequency observations are needed down to arcminute scales. Upcoming
interferometers and similar experiments will allow such studies to be
eventually carried out. A wide-field kinetic SZ map of the large scale structure
will allow an understanding of the large scale velocity field of baryons, as the
density fluctuations can be identified through cross-correlation of such a map
with the thermal SZ map [51].

9.3 Non-Linear Integrated Sachs-Wolfe Effect

The integrated Sachs-Wolfe effect [229] results from the late time decay of
gravitational potential fluctuations. The resulting temperature fluctuations in
the CMB can be written as

    $T^{\rm ISW}(\hat{\bf n}) = -2 \int_0^{r_0} dr\, \dot{\Phi}(r, \hat{\bf n} r)$ , (267)

where the overdot represents the derivative with respect to conformal distance
(or equivalently look-back time). Writing multipole moments of the temperature
fluctuation field $T(\hat{\bf n})$,

    $a_{lm} = \int d\hat{\bf n}\, T(\hat{\bf n})\, Y_l^{m*}(\hat{\bf n})$ , (268)

we can formulate the angular power spectrum as

    $\langle a^*_{l_1 m_1} a_{l_2 m_2} \rangle = \delta^D_{l_1 l_2} \delta^D_{m_1 m_2} C_{l_1}$ . (269)

For the ISW effect, multipole moments are

    $a_{lm}^{\rm ISW} = i^l \int \frac{d^3{\bf k}}{2\pi^2} \int dr\, \dot{\Phi}({\bf k})\, I_l(k)\, Y_l^{m}(\hat{\bf k})$ , (270)

with $I_l(k) = \int dr\, W^{\rm ISW}(k, r)\, j_l(kr)$, and the window function for the ISW effect,
$W^{\rm ISW} = -2$. The angular power spectrum is then given by

    $C_l^{\rm ISW} = \frac{2}{\pi} \int k^2 dk\, P_{\dot{\Phi}\dot{\Phi}}(k) \left[ I_l(k) \right]^2$ , (271)

where the three-dimensional power spectrum of the time-evolving potential
fluctuations is defined as

    $\langle \dot{\Phi}({\bf k}_1) \dot{\Phi}({\bf k}_2) \rangle = (2\pi)^3 \delta_D({\bf k}_1 + {\bf k}_2)\, P_{\dot{\Phi}\dot{\Phi}}(k_1)$ . (272)

The above expression for the angular power spectrum can be evaluated efficiently
under the Limber approximation [168] for sufficiently high $l$ values,
usually of order a few tens, as

In order to calculate the power spectrum of the time-derivative of potential
fluctuations, we make use of the cosmological Poisson equation in equation (29)
and write the derivative of the potential through a derivative of the density
field and the scale factor $a$. Considering a flat universe with $\Omega_K = 0$, we can
write the full expression for the power spectrum of time-evolving potential
fluctuations, as necessary for the ISW effect valid in all regimes of density
fluctuations, as

To calculate the power spectrum involving the correlations between time
derivatives of density fluctuations, $P_{\dot\delta\dot\delta}$, and the cross-correlation term involving
the density and time-derivative of the density fields, $P_{\delta\dot\delta}$, we make use of
the continuity equation in equation (19), which can be written in the form:

    $\dot\delta({\bf x}, r) = -\nabla \cdot [1 + \delta({\bf x}, r)]\, {\bf v}({\bf x}, r)$ . (275)

In the linear regime of fluctuations, when $\delta({\bf x}, r) = G(r)\delta({\bf x}, 0) \ll 1$, the
time derivative is simply $\dot\delta^{\rm lin}({\bf x}, r) = -\nabla \cdot {\bf v}({\bf x}, r)$, leading to the well-known
result for the linear theory velocity field (equation 27). Thus, in linear theory,
from equation (28), $P_{\dot\delta\dot\delta} = k^2 P_{vv}(k, r) = \dot{G}^2 P^{\rm lin}_{\delta\delta}(k, 0)$ and $P_{\delta\dot\delta} = k P_{\delta v}(k, r) =
G \dot{G} P^{\rm lin}_{\delta\delta}(k, 0)$.
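
The linear-theory relations above follow in one step from the growth-function scaling. A short worked version, assuming only $\delta({\bf x}, r) = G(r)\delta({\bf x}, 0)$ and a power spectrum convention of the form $\langle \delta({\bf k}_1)\delta({\bf k}_2) \rangle = (2\pi)^3 \delta_D({\bf k}_1 + {\bf k}_2) P(k_1)$:

```latex
\begin{align}
\dot\delta({\bf k}, r) &= \dot{G}(r)\, \delta({\bf k}, 0) , \\
\langle \dot\delta({\bf k}_1) \dot\delta({\bf k}_2) \rangle
  &= \dot{G}^2 \langle \delta({\bf k}_1, 0) \delta({\bf k}_2, 0) \rangle
  \;\Rightarrow\; P_{\dot\delta\dot\delta}(k, r) = \dot{G}^2 P^{\rm lin}_{\delta\delta}(k, 0) , \\
\langle \delta({\bf k}_1) \dot\delta({\bf k}_2) \rangle
  &= G \dot{G} \langle \delta({\bf k}_1, 0) \delta({\bf k}_2, 0) \rangle
  \;\Rightarrow\; P_{\delta\dot\delta}(k, r) = G \dot{G}\, P^{\rm lin}_{\delta\delta}(k, 0) .
\end{align}
```

Since $P_{\delta\delta}(k, r) = G^2 P^{\rm lin}_{\delta\delta}(k, 0)$ in the same limit, the ratio $P_{\delta\dot\delta}/\sqrt{P_{\delta\delta} P_{\dot\delta\dot\delta}}$ has unit magnitude in linear theory.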



These lead to the well-known results for the linear ISW effect, with a power
spectrum for $\dot\Phi$ as

The term within the square bracket is $\dot{F}^2$ where $F = G/a$, following the derivation
for the linear ISW effect in [54]. Even though we have replaced the divergence
of the velocity field with a time-derivative of the growth function, it should be
understood that the contributions to the ISW effect come from the divergence
of the velocity field and not directly from the density field. Thus, to some
extent, even the linear ISW effect reflects statistical properties of the large
scale structure velocities.

In the mildly non-linear to fully non-linear regime of fluctuations, the
approximation in equation (19), involving $\delta \ll 1$, is no longer valid and a full
calculation of the time-derivative of density perturbations is required. This can
be achieved in second order perturbation theory, though such an approximation
need not be fully applicable, as second order perturbation theory
fails to describe even the weakly non-linear regime of fluctuations exactly.
Motivated by applications of the halo approach to large scale structure and
results from numerical simulations [242,174,258], we consider a description for
the time-derivative of density fluctuations and rewrite equation (19) as

    $\dot\delta({\bf x}, r) = -\nabla \cdot {\bf v}({\bf x}, r) - \nabla \cdot \delta({\bf x}, r)\, {\bf v}({\bf x}, r)$ , (277)

where we have separated the momentum term involving ${\bf p} = (1 + \delta){\bf v}$ into a
velocity contribution and a density-velocity product. In Fourier space, the
time derivative is simply $\dot\delta({\bf k}) = i{\bf k} \cdot {\bf p}({\bf k})$, and the power spectrum of $\dot\delta$ can
be calculated following the halo model description of the momentum-density
field (§ 7.4).

In addition to the power spectrum of density derivatives, in equation (274),
we also require the cross power spectrum between density derivatives and the
density field itself, $P_{\delta\dot\delta}$. In § 7.4, using the halo approach as a description of
the momentum density field, we suggested that the cross-correlation between
the density field and the momentum field can be well described as

    $P_{p\delta}(k) = \sqrt{P_{pp}(k) P_{\delta\delta}(k)}$ . (278)

This is equivalent to the statement that the density and momentum density
fields are perfectly correlated, with a cross-correlation coefficient of 1; this
relation is exact at mildly-linear scales, while at deeply non-linear scales this
perfect cross-correlation requires mass independent peculiar velocity for
individual halos [258]. Using this observation, we make the assumption that
$P_{\delta\dot\delta} \sim \sqrt{P_{\delta\delta} P_{\dot\delta\dot\delta}}$, which is generally reproduced under the halo model description
of the cross-correlation between the density field and density field derivatives.
This cross-term leads to a 10% reduction of power at multipoles between 100
and 1000, when compared to the total when linear and non-linear contributions
are simply added.
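
The perfect-correlation assumption can be stated as a cross-correlation coefficient of one. A minimal sketch, with hypothetical helper names and made-up input numbers, showing the coefficient and the cross spectrum that the assumption implies:

```python
import math

def cross_correlation_coefficient(p_xy, p_xx, p_yy):
    """r = P_xy / sqrt(P_xx * P_yy); |r| <= 1 by the Cauchy-Schwarz inequality."""
    return p_xy / math.sqrt(p_xx * p_yy)

def cross_spectrum_perfect(p_xx, p_yy):
    """Cross spectrum under the assumption r = 1 (geometric mean of the two)."""
    return math.sqrt(p_xx * p_yy)

# Illustrative values only (arbitrary units), not taken from the source.
p_dd, p_pp = 4.0, 9.0
p_pd = cross_spectrum_perfect(p_pp, p_dd)
print(p_pd, cross_correlation_coefficient(p_pd, p_pp, p_dd))
```

Any measured cross spectrum below the geometric mean would correspond to a coefficient below one, which is how departures from this assumption would show up in practice.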

In figure 62, we show the angular power spectrum of the ISW effect with
its non-linear extension (which we have labeled RS for the Rees-Sciama effect
[222]). The curve labeled ISW effect is the simple linear theory calculation
with a power spectrum for potential derivatives given in equation (276). The
curves labeled "lin" and "nl" show the full non-linear calculation following
the description given in equation (274) and using the linear theory or full
non-linear power spectrum, in equation (157), for the density field, respectively.
For the non-linear density field power spectrum, we use the halo approach for
large scale structure clustering and calculate the power spectrum through a
distribution of dark matter halos. We use linear theory to describe the velocity
field in both linear and non-linear cases; since the velocity field only contributes
as an overall normalization, through $v_{\rm rms}$, its non-linear effects, usually at high
$k$ values, are not important due to the shape of the velocity power spectrum.

As shown in figure 62, the overall correction due to the non-linear ISW effect
leads to a roughly two orders of magnitude increase in power at $l \sim 1000$. The
difference between the linear and non-linear theory density field power spectra
in equation (157) only leads to at most an order of magnitude change in
power. Note that the curve labeled "lin" agrees with previous second order
perturbation theory calculations of the Rees-Sciama effect [242], while the
curve labeled "nl" is also consistent with previous estimates based on results
from numerical simulations.



10 Summary 

We have presented the halo approach to large scale structure clustering where 
we described the dark matter distribution of the local universe through a 
collection of collapsed and virialized halos. The statistical properties of the 
large scale structure can now be described through properties associated with 
these halos, such as their spatial distribution and the distribution of dark mat- 
ter within these halos. These halo properties are well studied either through 
analytical models or numerical simulations and include such necessary infor- 
mation as the halo mass function, halo bias relative to linear density field and 
the halo dark matter profile. 

The halo approach to clustering essentially allows one to bridge the linear
regime described by perturbation theories to the non-linear regime described
by clustering of dark matter within halos. The perturbation theories fail to
describe the weakly to strongly non-linear regime completely, while the halo
model predictions are in better agreement with numerical results based on
simulations. Though statistically averaged measurements are well reproduced
by the halo based calculations, in detail, individual configurations of higher
order correlations are only reproduced at the 20% level. The uncertainties here
are mostly due to assumptions in the current halo model calculations, such as
the use of spherical halos or ignoring the substructure within halos.

Though such uncertainties limit the accuracy of halo based calculations, the
approach has the advantage that it can be easily extended to describe a wide
variety of large scale structure properties. In this review, we have discussed
statistical aspects involving the galaxy distribution, velocities and pressure. In
order to calculate statistical aspects associated with these physical properties,
we have introduced simple descriptions of how they relate to
dark matter within halos; almost all of these relations are based on numerical
simulation results. Using these descriptions, we have discussed a wide range
of applications of the halo model for non-linear clustering, including observations
of the dark matter distribution via weak lensing, galaxy properties via
wide-field redshift and imaging surveys, and applications to upcoming cosmic
microwave background anisotropy experiments. The halo model has already
become useful for several purposes, including (1) understanding why galaxy
clustering essentially produces a power-law correlation function or power
spectrum, (2) estimating statistical biases in current and upcoming large scale
structure weak lensing surveys, and (3) calculating the full covariance matrix
associated with certain large scale structure observations, such as the angular
correlation function of galaxies in the Sloan Digital Sky Survey, among others.

Fig. 55. The angular power spectra of SZ thermal and kinetic effects. As shown
in (a), the thermal SZ effect is dominated by individual halos, and thus, by the
single halo term, while the kinetic effect is dominated by the large scale structure
correlations depicted by the 2-halo term. In (b), we show the mass dependence of
the SZ thermal and kinetic effects with a maximum mass of $10^{16}$ and $10^{13}\,M_\odot$. The
SZ thermal effect is strongly dependent on the maximum mass, while due to large
scale correlations, the kinetic effect is not.

Fig. 56. The SZ power spectrum based on numerical simulations and the analytical
calculations based on the halo model. The simulations are consistent with the mass 
distribution of halos in the simulated box. The decrease in power at largest scales is 
due to the lack of most massive halos, which are rare. The simulations are in good 
agreement with the halo based calculations. The figure is from [224]. 





Fig. 57. Line of sight projected maps of the thermal (left) and kinetic (right) SZ
effects. The maps are 1° on a side and cover the same field of view. Note that the
thermal SZ map picks out massive halos while contributions to the kinetic SZ effect
come from a wide range of masses. Unlike thermal SZ, which produces a
decrement at Rayleigh-Jeans wavelengths, the kinetic SZ effect oscillates between
negative and positive values depending on the direction of the velocity field. Here,
structures in red are moving towards the observer while those in blue are moving away.
This figure is from [266].

Fig. 58. Distribution of peak heights or number counts of thermal SZ temperature
decrements in simulations (solid histograms) and in analytical calculations (solid
curves). Dashed lines show the contributions to the total from halos with mass
in the range $10^{13}$ to $10^{14}$, $10^{14}$ to $10^{15}$ and above $10^{15}$, in order of increasing
temperature decrement values. Here $t_{\rm SZ} = \Delta T_{\rm SZ}/T_{\rm CMB}$. The figure is from [296].

Fig. 59. The temperature fluctuation power ($\Delta T^2 = l(l+1)/(2\pi)\, C_l T^2_{\rm CMB}$) for a
variety of methods to calculate the kinetic SZ effect. Here, we show the contribution
for a reionization redshift of $\sim 8$ and an optical depth to reionization of 0.05. The
contributions are calculated under the assumption that the baryon field traces the
non-linear dark matter ($P_g(k) = P_\delta(k)$ with $P_\delta(k)$ predicted by the halo model),
the linear density field ($P_g(k) = P^{\rm lin}(k)$), and the halo model for gas, with total
and the 2-halo contributions shown separately. For the most part, the kinetic SZ
effect can be described using linear theory, and the non-linearities only increase the
temperature fluctuation power by a factor of a few at $l \sim 10^5$.

Fig. 60. Contribution to temperature fluctuations through halo rotation for a cluster
of mass $5 \times 10^{14}\,M_\odot$ at a redshift of 0.5. The temperature fluctuations produce
a distinct bipolar-like pattern on the sky with a maximum of $\sim 2.5\,\mu$K. Here, the
rotational axis is perpendicular to the line of sight and the x and y coordinates are in
terms of the scale radius of the cluster, based on the NFW profile.

Fig. 61. Temperature fluctuations due to galaxy clusters: (a) kinetic SZ effect
involving peculiar motion, (b) lensing of CMB primary temperature fluctuations, and
(c) the total contribution from kinetic SZ, lensing and rotational velocity. The total
contribution leads to an asymmetric bipolar pattern with a sharp rise towards the center.
We have not included the thermal SZ effect as its contribution can be separated
from these effects, and from primary temperature fluctuations, based on its frequency
dependence. We use the same cluster as shown in figure 60.

Fig. 62. The angular power spectrum of the full ISW effect, including the non-linear
contribution. The contribution called Rees-Sciama (RS) shows the non-linear
extension, though for the total contribution, the cross term between the momentum
field and the density field leads to a slight suppression for $l$ between 100 and 1000.
The curve labeled "nl" is the full non-linear contribution while the curve labeled
"lin" is the contribution resulting from the momentum field under second order
perturbation theory.